A Machine Learning Framework for Diagnosing and Predicting the Severity of Coronary Artery Disease

Background: Although machine learning (ML)-based prediction of coronary artery disease (CAD) has gained increasing attention, assessment of the severity of suspected CAD in symptomatic patients remains challenging. Methods: The training set for this study consisted of 284 retrospective participants,...

Bibliographic Details
Main Authors: Aikeliyaer Ainiwaer, Wen Qing Hou, Kaisaierjiang Kadier, Rena Rehemuding, Peng Fei Liu, Halimulati Maimaiti, Lian Qin, Xiang Ma, Jian Guo Dai
Format: Article
Language: English
Published: IMR Press 2023-06-01
Series: Reviews in Cardiovascular Medicine
Subjects: machine learning, coronary artery disease, syntax score, gensini score
Online Access: https://www.imrpress.com/journal/RCM/24/6/10.31083/j.rcm2406168
author Aikeliyaer Ainiwaer
Wen Qing Hou
Kaisaierjiang Kadier
Rena Rehemuding
Peng Fei Liu
Halimulati Maimaiti
Lian Qin
Xiang Ma
Jian Guo Dai
collection DOAJ
description Background: Although machine learning (ML)-based prediction of coronary artery disease (CAD) has gained increasing attention, assessing the severity of suspected CAD in symptomatic patients remains challenging. Methods: The training set for this study consisted of 284 retrospective participants, while the test set included 116 prospectively enrolled participants from whom we collected 53 baseline variables and coronary angiography results. The data were pre-processed with outlier handling and one-hot encoding. In the first stage, we constructed a dichotomous (binary) ML model that used baseline information to predict the presence of CAD. In the second stage, baseline information was used to construct ML regression models for predicting the severity of CAD. The non-CAD population was included in these models, and two different scores (SYNTAX and GENSINI) were used as output variables. Finally, statistical analysis and SHAP plot visualization were used to explore the relationship between baseline information and CAD. Results: The study included 269 CAD patients and 131 healthy controls. The eXtreme Gradient Boosting (XGBoost) model performed best among the models for predicting CAD, with an area under the receiver operating characteristic curve of 0.728 (95% CI 0.623–0.824). The main correlates were left ventricular ejection fraction, homocysteine, and hemoglobin (p < 0.001). The XGBoost model also performed best for predicting the SYNTAX score, with the main correlates being brain natriuretic peptide (BNP), left ventricular ejection fraction, and glycated hemoglobin (p < 0.001). The main relevant features in the model predicting the GENSINI score were BNP, high-density lipoprotein, and homocysteine (p < 0.001). Conclusions: This data-driven approach provides a foundation for the risk stratification and severity assessment of CAD.
Clinical Trial Registration: The study was registered in the www.clinicaltrials.gov protocol registration system (number NCT05018715).
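The abstract reports an area under the ROC curve with a 95% confidence interval but does not say how the interval was computed. The sketch below is one common way to obtain such an interval, a percentile bootstrap over test-set participants; it is an assumption for illustration, not the authors' stated method. The function names (`roc_auc`, `bootstrap_auc_ci`) and the toy labels and scores are hypothetical and are not study data.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    roc_auc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        # Resample participants with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined without both classes
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example: 1 = CAD, 0 = no CAD; scores are hypothetical model outputs.
labels = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.35, 0.2, 0.1]
auc = roc_auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores, n_boot=500)
print(f"AUC = {auc:.3f}, 95% CI {ci_low:.3f} to {ci_high:.3f}")
```

With a test set of 116 participants, as in this study, a bootstrap interval of this kind would be expected to be fairly wide, which is consistent with the reported range of 0.623 to 0.824.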
first_indexed 2024-03-13T02:35:53Z
format Article
id doaj.art-047dffa4a5954c8394e4abede73a8807
institution Directory Open Access Journal
issn 1530-6550
language English
last_indexed 2024-03-13T02:35:53Z
publishDate 2023-06-01
publisher IMR Press
record_format Article
series Reviews in Cardiovascular Medicine
spelling doaj.art-047dffa4a5954c8394e4abede73a8807
2023-06-29T02:06:46Z
eng
IMR Press
Reviews in Cardiovascular Medicine
1530-6550
2023-06-01
24
6
168
10.31083/j.rcm2406168
S1530-6550(23)00935-3
A Machine Learning Framework for Diagnosing and Predicting the Severity of Coronary Artery Disease
0 Aikeliyaer Ainiwaer, Department of Cardiology, The First Affiliated Hospital of Xinjiang Medical University, 830011 Urumqi, Xinjiang, China
1 Wen Qing Hou, College of Information Science and Technology, Shihezi University, 832003 Shihezi, Xinjiang, China
2 Kaisaierjiang Kadier, Department of Cardiology, The First Affiliated Hospital of Xinjiang Medical University, 830011 Urumqi, Xinjiang, China
3 Rena Rehemuding, Department of Cardiology, The First Affiliated Hospital of Xinjiang Medical University, 830011 Urumqi, Xinjiang, China
4 Peng Fei Liu, Department of Cardiology, The First Affiliated Hospital of Xinjiang Medical University, 830011 Urumqi, Xinjiang, China
5 Halimulati Maimaiti, Department of Cardiology, The First Affiliated Hospital of Xinjiang Medical University, 830011 Urumqi, Xinjiang, China
6 Lian Qin, Department of Cardiology, The First Affiliated Hospital of Xinjiang Medical University, 830011 Urumqi, Xinjiang, China
7 Xiang Ma, Department of Cardiology, The First Affiliated Hospital of Xinjiang Medical University, 830011 Urumqi, Xinjiang, China
8 Jian Guo Dai, College of Information Science and Technology, Shihezi University, 832003 Shihezi, Xinjiang, China
https://www.imrpress.com/journal/RCM/24/6/10.31083/j.rcm2406168
machine learning
coronary artery disease
syntax score
gensini score
title A Machine Learning Framework for Diagnosing and Predicting the Severity of Coronary Artery Disease
topic machine learning
coronary artery disease
syntax score
gensini score
url https://www.imrpress.com/journal/RCM/24/6/10.31083/j.rcm2406168