PREDICTING DIABETES PROGRESSION

Authors

  • Preethesh Shetty, Avil Noronha, Dr. Ruben

DOI:

https://doi.org/10.25215/8194288797.22

Abstract

Diabetes mellitus is a chronic metabolic disorder that affects millions of people worldwide, and early detection and timely care are needed to avoid severe complications. This study aims to build a machine learning model that predicts the likelihood of developing diabetes. We used the Pima Indians Diabetes Dataset, available on Kaggle, which contains health-related measurements such as glucose level, body mass index (BMI), insulin concentration, and age. Before training, the data were carefully cleaned and preprocessed by normalizing values and handling missing records to improve accuracy and consistency. To identify the best-performing model, we tested several supervised learning algorithms: Logistic Regression, Decision Tree, Random Forest, Support Vector Machine (SVM), and XGBoost. The models were compared on accuracy, precision, recall, F1-score, and ROC-AUC. XGBoost performed best, achieving the highest accuracy and the most reliable results overall; it made predictions more efficiently and captured complex patterns in the dataset better than the other models. Feature importance analysis showed that glucose level, BMI, and age were the most influential predictors, indicating that these health parameters play a major role in determining the risk of diabetes progression. Overall, the findings suggest that machine learning, especially ensemble-based models like XGBoost, can be a powerful tool for early diabetes detection. Such systems can support doctors and healthcare professionals in identifying at-risk individuals and help design personalized treatment and prevention strategies for better patient care.
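
The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers are not results from the study.

```python
# Illustrative sketch only: toy data and helper names are hypothetical,
# not taken from the study.

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: true labels and hypothetical predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, while ROC-AUC is computed from the raw scores, which is one reason to report both kinds of metric when comparing classifiers.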

Published

2026-03-13