EDUPREDICT: A SOCIO-ECONOMIC AND ACADEMIC INTEGRATED MACHINE LEARNING FRAMEWORK FOR STUDENT SUCCESS PREDICTION

Authors

  • Soujanya, Thanishka, Dr. Jeevan Pinto

DOI:

https://doi.org/10.25215/8194288797.33

Abstract

EduPredict is an explainable machine learning system designed to predict student outcomes: whether a student is likely to Drop out, Stay Enrolled, or Graduate. It combines academic records (grades, credits, and pass rates), family and social background (parents’ occupations and household income), and economic factors (unemployment and GDP) to identify what influences a student’s success. The system uses three machine learning models: LightGBM, XGBoost, and Logistic Regression. XGBoost performed best with an accuracy of 94.46%, followed by LightGBM at 85.35% and Logistic Regression at 83.34%. To handle class imbalance, the SMOTE technique was applied during model training, and SHAP analysis was used to explain which features had the most impact on predictions. Results showed that academic performance, parental occupation, and regional economic conditions were the strongest factors affecting student outcomes. The system found it hardest to predict students who are still Enrolled, which suggests that more data or further tuning may be needed for that class. Overall, EduPredict provides easy-to-interpret insights that can help educators and policymakers identify at-risk students early, plan interventions, and make better decisions to improve student success rates.
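
The abstract mentions SMOTE for class imbalance but gives no implementation details. The sketch below illustrates only the general idea behind SMOTE: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest same-class neighbours. It is not the authors' code. The function name `smote_oversample`, the two feature columns (average grade, pass rate), the class labels, and all data values are hypothetical and chosen for illustration. A real pipeline would more likely use a library implementation such as the `SMOTE` class in imbalanced-learn.

```python
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Balance classes by adding SMOTE-style synthetic minority samples.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the largest class
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        minority = [row for row, lab in zip(X, y) if lab == label]
        if n == target or len(minority) < 2:
            continue  # already balanced, or no neighbour to interpolate with
        for _ in range(target - n):
            base = rng.choice(minority)
            # Up to k nearest same-class neighbours of the base sample,
            # ranked by squared Euclidean distance.
            neighbours = sorted(
                (p for p in minority if p is not base),
                key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
            )[:k]
            other = rng.choice(neighbours)
            t = rng.random()  # position along the segment base -> other
            X_out.append([a + t * (b - a) for a, b in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Made-up training rows: [average grade, pass rate].
X_train = [
    [14.0, 0.90], [13.5, 0.80], [15.0, 1.00],
    [12.5, 0.85], [13.0, 0.95], [14.5, 0.90],  # Graduate
    [8.0, 0.30], [9.0, 0.40], [7.5, 0.20],     # Dropout
    [11.0, 0.60], [10.5, 0.55],                # Enrolled
]
y_train = ["Graduate"] * 6 + ["Dropout"] * 3 + ["Enrolled"] * 2

X_bal, y_bal = smote_oversample(X_train, y_train, k=3)
class_counts = Counter(y_bal)
print(class_counts)
```

Resampling is normally applied to the training split only, so that evaluation data keeps the original class distribution. This is consistent with the abstract's statement that SMOTE was applied during model training.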

Published

2026-03-13