Predicting diabetic kidney disease for type 2 diabetes mellitus by machine learning in the real world: a multicenter retrospective study

Objective: Diabetic kidney disease (DKD) has been reported as a main microvascular complication of diabetes mellitus. Although renal biopsy is capable of distinguishing DKD from non-diabetic kidney disease (NDKD), no gold standard has been validated to assess the development of DKD. This study aimed to...


Bibliographic Details
Main Authors:	Xiao zhu Liu, Minjie Duan, Hao dong Huang, Yang Zhang, Tian yu Xiang, Wu ceng Niu, Bei Zhou, Hao lin Wang, Ting ting Zhang
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2023-07-01
Series:	Frontiers in Endocrinology
Subjects:	type 2 diabetes mellitus, diabetic kidney disease, machine learning, prediction, CatBoost model
Online Access:	https://www.frontiersin.org/articles/10.3389/fendo.2023.1184190/full

_version_	1797787758490550272
author	Xiao zhu Liu, Minjie Duan, Hao dong Huang, Yang Zhang, Tian yu Xiang, Wu ceng Niu, Bei Zhou, Hao lin Wang, Ting ting Zhang
author_facet	Xiao zhu Liu, Minjie Duan, Hao dong Huang, Yang Zhang, Tian yu Xiang, Wu ceng Niu, Bei Zhou, Hao lin Wang, Ting ting Zhang
author_sort	Xiao zhu Liu
collection	DOAJ
description	Objective: Diabetic kidney disease (DKD) has been reported as a main microvascular complication of diabetes mellitus. Although renal biopsy is capable of distinguishing DKD from non-diabetic kidney disease (NDKD), no gold standard has been validated to assess the development of DKD. This study aimed to build an auxiliary diagnosis model for type 2 diabetic kidney disease (T2DKD) based on machine learning algorithms. Methods: Clinical data on 3624 individuals with type 2 diabetes (T2DM) were gathered from January 1, 2019 to December 31, 2019 using a multi-center retrospective database. The data were randomly split into a training set and a validation set at a ratio of 8:2. To identify critical clinical variables, the least absolute shrinkage and selection operator (LASSO) was employed. Fifteen machine learning models were built to support the diagnosis of T2DKD, and the optimal model was selected according to the area under the receiver operating characteristic curve (AUC) and accuracy. The model was then tuned with Bayesian optimization. The Shapley Additive Explanations (SHAP) approach was used to interpret the predictions. Results: DKD was diagnosed in 1856 (51.2 percent) of the 3624 individuals in the final cohort. As revealed by the SHAP findings, the Categorical Boosting (CatBoost) model achieved the best performance in predicting the risk of T2DKD, with an AUC of 0.86 based on the top 38 characteristics. Guided by the SHAP findings, a simplified CatBoost model built on the top 12 characteristics achieved an AUC of 0.84. The features of the simplified model were systolic blood pressure (SBP), creatinine (CREA), length of stay (LOS), thrombin time (TT), age, prothrombin time (PT), platelet large cell ratio (P-LCR), albumin (ALB), glucose (GLU), fibrinogen (FIB-C), red blood cell distribution width-standard deviation (RDW-SD), and hemoglobin A1C (HbA1C). Conclusion: A machine learning-based model for predicting the risk of developing T2DKD was built, and its effectiveness was verified. The CatBoost model can contribute to the diagnosis of T2DKD. Clinicians could gain more insight into the outcomes if the ML model is made interpretable.
first_indexed	2024-03-13T01:27:21Z
format	Article
id	doaj.art-a1baf81fab2f43abbfb9a7ebeb07c0ac
institution	Directory Open Access Journal
issn	1664-2392
language	English
last_indexed	2024-03-13T01:27:21Z
publishDate	2023-07-01
publisher	Frontiers Media S.A.
record_format	Article
series	Frontiers in Endocrinology
spelling	doaj.art-a1baf81fab2f43abbfb9a7ebeb07c0ac; 2023-07-04T13:43:19Z; eng; Frontiers Media S.A.; Frontiers in Endocrinology; 1664-2392; 2023-07-01; 14; 10.3389/fendo.2023.1184190; 1184190; Predicting diabetic kidney disease for type 2 diabetes mellitus by machine learning in the real world: a multicenter retrospective study; Xiao zhu Liu (0, 1), Minjie Duan (2, 3), Hao dong Huang (4, 5), Yang Zhang (6, 7), Tian yu Xiang (8), Wu ceng Niu (9), Bei Zhou (10), Hao lin Wang (11), Ting ting Zhang (12); (0) Department of Cardiology, the Second Affiliated Hospital of Chongqing Medical University, Chongqing, China; (1) Medical Data Science Academy, Chongqing Medical University, Chongqing, China; (2) Medical Data Science Academy, Chongqing Medical University, Chongqing, China; (3) College of Medical Informatics, Chongqing Medical University, Chongqing, China; (4) Medical Data Science Academy, Chongqing Medical University, Chongqing, China; (5) College of Medical Informatics, Chongqing Medical University, Chongqing, China; (6) Medical Data Science Academy, Chongqing Medical University, Chongqing, China; (7) College of Medical Informatics, Chongqing Medical University, Chongqing, China; (8) Information Center, The University-Town Hospital of Chongqing Medical University, Chongqing, China; (9) Department of Nuclear Medicine, Handan First Hospital, Hebei, China; (10) Department of Cardiology, the Second Affiliated Hospital of Chongqing Medical University, Chongqing, China; (11) College of Medical Informatics, Chongqing Medical University, Chongqing, China; (12) Department of Endocrinology, Fifth Medical Center of Chinese People's Liberation Army (PLA) Hospital, Beijing, China; https://www.frontiersin.org/articles/10.3389/fendo.2023.1184190/full; type 2 diabetes mellitus; diabetic kidney disease; machine learning; prediction; CatBoost model
title	Predicting diabetic kidney disease for type 2 diabetes mellitus by machine learning in the real world: a multicenter retrospective study
title_sort	predicting diabetic kidney disease for type 2 diabetes mellitus by machine learning in the real world a multicenter retrospective study
topic	type 2 diabetes mellitus, diabetic kidney disease, machine learning, prediction, CatBoost model
url	https://www.frontiersin.org/articles/10.3389/fendo.2023.1184190/full


Similar Items