Hospital Readmission Risk Prediction using Machine Learning
Program: Data Science Master's Degree
Location: Not Specified (remote)
Student: Jayanth Yerragudi
Hospital readmissions, especially among patients living with diabetes, remain a persistent challenge for health systems because they carry both clinical risks and financial consequences (Why do patients keep coming back? Results of a Readmitted Patient Survey, n.d.). Using a dataset that includes patient demographics, clinical information, medications, and prior visits, this report examines whether machine learning models can help identify which patients are most likely to be readmitted within 30 days of discharge. I compared several classification approaches, including Logistic Regression, Random Forest, and LightGBM, and engineered additional features to capture meaningful clinical patterns. Because the number of readmissions in the dataset is relatively small, I used stratified sampling to preserve the class ratio across data splits and SMOTE to oversample the minority class during training. Among the models tested, LightGBM showed the strongest ability to distinguish between high-risk and low-risk patients. This report walks through the modeling workflow, highlights the key factors that influence readmission risk, and discusses how these findings could support clinicians in making more proactive care decisions. The project aimed to: (1) reduce avoidable 30-day readmissions by identifying patients most likely to return, (2) build transparent, interpretable models evaluated with reliable metrics such as ROC-AUC and AUPRC, and (3) provide actionable insights through key predictive features and practical recommendations.
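To make the two evaluation metrics named above concrete, the following is a minimal sketch of what ROC-AUC and AUPRC measure when readmissions are rare. AUPRC is estimated here as average precision. The function names, labels, and scores are illustrative assumptions and are not taken from the project's data or code. In practice a library implementation such as scikit-learn's `roc_auc_score` and `average_precision_score` would normally be used instead.

```python
# Toy illustration only: hypothetical labels and scores, not project data.

def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision measured at the rank of each positive.

    This is a step-wise estimate of the area under the precision-recall
    curve. The sketch assumes the scores are distinct (no tie handling).
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Ten hypothetical patients, two of whom were readmitted (label 1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.90, 0.60, 0.80]

auc = roc_auc(y_true, scores)           # 14 of 16 positive/negative pairs ordered correctly
ap = average_precision(y_true, scores)  # mean of 1/2 and 2/3
prevalence = sum(y_true) / len(y_true)  # the AUPRC of a random ranking is close to this

print(f"ROC-AUC = {auc:.3f}, AUPRC (average precision) = {ap:.3f}, prevalence = {prevalence:.2f}")
```

In this toy case a single high-scoring false positive leaves ROC-AUC at 0.875 but pulls average precision down to about 0.583. This is one reason to report both metrics when the positive class is small: ROC-AUC summarizes ranking quality over all positive/negative pairs, while AUPRC is more sensitive to false alarms among the highest-risk predictions.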