Prediction model for gestational diabetes mellitus using the XG Boost machine learning algorithm

Objective: To develop the extreme gradient boosting (XG Boost) machine learning (ML) model for predicting gestational diabetes mellitus (GDM) compared with a model using the traditional logistic regression (LR) method. Methods: A case–control study was carried out among pregnant women, who were assigned...

Bibliographic Details
Main Authors: Xiaoqi Hu, Xiaolin Hu, Ya Yu, Jia Wang
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-03-01
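Series: Frontiers in Endocrinology
Subjects: gestational diabetes mellitus; machine learning; prediction model; extreme gradient boosting; logistic regression
Online Access: https://www.frontiersin.org/articles/10.3389/fendo.2023.1105062/full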
author Xiaoqi Hu
Xiaolin Hu
Ya Yu
Jia Wang
author_sort Xiaoqi Hu
collection DOAJ
description Objective: To develop the extreme gradient boosting (XG Boost) machine learning (ML) model for predicting gestational diabetes mellitus (GDM) compared with a model using the traditional logistic regression (LR) method. Methods: A case–control study was carried out among pregnant women, who were assigned to either the training set (these women were recruited from August 2019 to November 2019) or the testing set (these women were recruited in August 2020). We applied the XG Boost ML model approach to identify the best set of predictors out of a set of 33 variables. The performance of the prediction model was determined by using the area under the receiver operating characteristic (ROC) curve (AUC) to assess discrimination, and the Hosmer–Lemeshow (HL) test and calibration plots to assess calibration. Decision curve analysis (DCA) was introduced to evaluate the clinical use of each of the models. Results: A total of 735 and 190 pregnant women were included in the training and testing sets, respectively. The XG Boost ML model, which included 20 predictors, resulted in an AUC of 0.946 and yielded a predictive accuracy of 0.875, whereas the model using a traditional LR included four predictors and presented an AUC of 0.752 and yielded a predictive accuracy of 0.786. The HL test and calibration plots show that the two models have good calibration. DCA indicated that treating only those women whom the XG Boost ML model predicts are at risk of GDM confers a net benefit compared with treating all women or treating none. Conclusions: The established model using XG Boost ML showed better predictive ability than the traditional LR model in terms of discrimination. The calibration performance of both models was good.
first_indexed 2024-04-10T05:11:54Z
format Article
id doaj.art-bcca8970ad4f47f4902fa9db50e4e58e
institution Directory Open Access Journal
issn 1664-2392
language English
last_indexed 2024-04-10T05:11:54Z
publishDate 2023-03-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Endocrinology
spelling doaj.art-bcca8970ad4f47f4902fa9db50e4e58e
2023-03-09T07:17:32Z
eng
Frontiers Media S.A.
Frontiers in Endocrinology
1664-2392
2023-03-01
14
10.3389/fendo.2023.1105062
1105062
Prediction model for gestational diabetes mellitus using the XG Boost machine learning algorithm
Xiaoqi Hu (Department of Nursing, Yantian District People's Hospital, Shenzhen, Guangdong, China)
Xiaolin Hu (School of Basic Medical Sciences, Southern Medical University, Guangzhou, Guangdong, China)
Ya Yu (Department of Nursing, Guangzhou First People's Hospital, Guangzhou, Guangdong, China)
Jia Wang (Department of Nursing, Shenzhen Hospital of Southern Medical University, Shenzhen, Guangdong, China)
Objective: To develop the extreme gradient boosting (XG Boost) machine learning (ML) model for predicting gestational diabetes mellitus (GDM) compared with a model using the traditional logistic regression (LR) method. Methods: A case–control study was carried out among pregnant women, who were assigned to either the training set (these women were recruited from August 2019 to November 2019) or the testing set (these women were recruited in August 2020). We applied the XG Boost ML model approach to identify the best set of predictors out of a set of 33 variables. The performance of the prediction model was determined by using the area under the receiver operating characteristic (ROC) curve (AUC) to assess discrimination, and the Hosmer–Lemeshow (HL) test and calibration plots to assess calibration. Decision curve analysis (DCA) was introduced to evaluate the clinical use of each of the models. Results: A total of 735 and 190 pregnant women were included in the training and testing sets, respectively. The XG Boost ML model, which included 20 predictors, resulted in an AUC of 0.946 and yielded a predictive accuracy of 0.875, whereas the model using a traditional LR included four predictors and presented an AUC of 0.752 and yielded a predictive accuracy of 0.786. The HL test and calibration plots show that the two models have good calibration. DCA indicated that treating only those women whom the XG Boost ML model predicts are at risk of GDM confers a net benefit compared with treating all women or treating none. Conclusions: The established model using XG Boost ML showed better predictive ability than the traditional LR model in terms of discrimination. The calibration performance of both models was good.
https://www.frontiersin.org/articles/10.3389/fendo.2023.1105062/full
gestational diabetes mellitus
machine learning
prediction model
extreme gradient boosting
logistic regression
title Prediction model for gestational diabetes mellitus using the XG Boost machine learning algorithm
topic gestational diabetes mellitus
machine learning
prediction model
extreme gradient boosting
logistic regression
url https://www.frontiersin.org/articles/10.3389/fendo.2023.1105062/full