Education, Science, Technology, Innovation and Life
Open Access

Research on Diabetes Prediction Model Based on XGBoost Algorithm

DOI: 10.23977/icamcs2019.47

Author(s)

Wu Hao

Corresponding Author

Wu Hao

ABSTRACT

This study explores the use of the XGBoost algorithm for predicting the risk of diabetes mellitus. The Pima Indian Diabetes data set from the UCI machine learning database was used, and 70% of the samples were randomly selected to build the models. Eight factors were taken as independent variables: number of pregnancies, plasma glucose concentration at 2 hours in an oral glucose tolerance test, diastolic blood pressure (mm Hg), triceps skinfold thickness (mm), 2-hour serum insulin (mu U/ml), body mass index (kg/m2), diabetes pedigree function value and age. With diabetes status as the dependent variable, prediction models were established using Logistic regression and XGBoost, respectively. Each model was then applied to the remaining 30% of the samples, and its predictive performance was evaluated by accuracy (the proportion of correctly classified samples). The accuracies of the Logistic regression model and the XGBoost model were 77% and 83%, respectively. On this data set, XGBoost achieved higher prediction accuracy than traditional Logistic regression.
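The evaluation protocol described in the abstract (a random 70% of samples for fitting, the remaining 30% for scoring by accuracy) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the author's code: the function names `split_70_30` and `accuracy`, the fixed seed, the sample count of 768 and the toy labels are all assumptions introduced here, and no model is actually fitted.

```python
import random


def split_70_30(rows, train_fraction=0.7, seed=0):
    """Randomly partition rows into a fitting part and a held-out part.

    The seed is an assumption made for reproducibility; the abstract
    does not say how the random selection was performed.
    """
    rng = random.Random(seed)
    shuffled = list(rows)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions equal to the true labels (the 'correct rate')."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equally long")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical usage: sample indices stand in for the data set rows.
# 768 is assumed here as the sample count; adjust to the file actually used.
sample_ids = list(range(768))
train_ids, test_ids = split_70_30(sample_ids)

# Toy labels only, to show how the metric is computed on held-out samples.
toy_true = [1, 0, 1, 1]
toy_pred = [1, 0, 0, 1]
toy_score = accuracy(toy_true, toy_pred)
```

In a full experiment, a Logistic regression model and an XGBoost classifier would each be fitted on the rows indexed by `train_ids`, and `accuracy` would be computed from their predictions on the rows indexed by `test_ids`.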

KEYWORDS

Diabetes, XGBoost, Forecast

All published work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright © 2016 - 2031 Clausius Scientific Press Inc. All Rights Reserved.