A Machine Learning Framework for Diagnosing and Predicting the Severity of Coronary Artery Disease
Background: Although machine learning (ML)-based prediction of coronary artery disease (CAD) has gained increasing attention, assessment of the severity of suspected CAD in symptomatic patients remains challenging. Methods: The training set for this study consisted of 284 retrospective participants,...
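Main Authors: | Aikeliyaer Ainiwaer, Wen Qing Hou, Kaisaierjiang Kadier, Rena Rehemuding, Peng Fei Liu, Halimulati Maimaiti, Lian Qin, Xiang Ma, Jian Guo Dai |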
---|---|
Format: | Article |
Language: | English |
Published: | IMR Press, 2023-06-01 |
Series: | Reviews in Cardiovascular Medicine |
Subjects: | machine learning; coronary artery disease; syntax score; gensini score |
Online Access: | https://www.imrpress.com/journal/RCM/24/6/10.31083/j.rcm2406168 |
author | Aikeliyaer Ainiwaer; Wen Qing Hou; Kaisaierjiang Kadier; Rena Rehemuding; Peng Fei Liu; Halimulati Maimaiti; Lian Qin; Xiang Ma; Jian Guo Dai |
author_sort | Aikeliyaer Ainiwaer |
collection | DOAJ |
description | Background: Although machine learning (ML)-based prediction of coronary artery disease (CAD) has gained increasing attention, assessment of the severity of suspected CAD in symptomatic patients remains challenging. Methods: The training set for this study consisted of 284 retrospective participants, while the test set included 116 prospectively enrolled participants from whom we collected 53 baseline variables and coronary angiography results. The data were pre-processed with outlier processing and one-hot encoding. In the first stage, we constructed an ML classification model that used baseline information to predict the presence or absence of CAD. In the second stage, baseline information was used to construct ML regression models for predicting the severity of CAD. The non-CAD population was included, and two different scores (SYNTAX and GENSINI) were used as output variables. Finally, statistical analysis and SHAP plot visualization methods were employed to explore the relationship between baseline information and CAD. Results: The study included 269 CAD patients and 131 healthy controls. The eXtreme Gradient Boosting (XGBoost) model exhibited the best performance among the models compared for predicting CAD, with an area under the receiver operating characteristic curve of 0.728 (95% CI 0.623–0.824). The main correlates were left ventricular ejection fraction, homocysteine, and hemoglobin (p < 0.001). The XGBoost model also performed best for predicting the SYNTAX score, with the main correlates being brain natriuretic peptide (BNP), left ventricular ejection fraction, and glycated hemoglobin (p < 0.001). The main relevant features in the model predicting the GENSINI score were BNP, high-density lipoprotein, and homocysteine (p < 0.001). Conclusions: This data-driven approach provides a foundation for the risk stratification and severity assessment of CAD.
Clinical Trial Registration: The study was registered in the www.clinicaltrials.gov protocol registration system (number NCT05018715). |
first_indexed | 2024-03-13T02:35:53Z |
format | Article |
id | doaj.art-047dffa4a5954c8394e4abede73a8807 |
institution | Directory Open Access Journal |
issn | 1530-6550 |
language | English |
last_indexed | 2024-03-13T02:35:53Z |
publishDate | 2023-06-01 |
publisher | IMR Press |
record_format | Article |
series | Reviews in Cardiovascular Medicine |
spelling | doaj.art-047dffa4a5954c8394e4abede73a8807; 2023-06-29T02:06:46Z; eng; IMR Press; Reviews in Cardiovascular Medicine; ISSN 1530-6550; 2023-06-01; 24(6):168; DOI 10.31083/j.rcm2406168; PII S1530-6550(23)00935-3. Affiliations: Aikeliyaer Ainiwaer, Kaisaierjiang Kadier, Rena Rehemuding, Peng Fei Liu, Halimulati Maimaiti, Lian Qin, Xiang Ma (Department of Cardiology, The First Affiliated Hospital of Xinjiang Medical University, 830011 Urumqi, Xinjiang, China); Wen Qing Hou, Jian Guo Dai (College of Information Science and Technology, Shihezi University, 832003 Shihezi, Xinjiang, China). |
title | A Machine Learning Framework for Diagnosing and Predicting the Severity of Coronary Artery Disease |
topic | machine learning; coronary artery disease; syntax score; gensini score |
url | https://www.imrpress.com/journal/RCM/24/6/10.31083/j.rcm2406168 |
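
The description reports an area under the receiver operating characteristic curve with a 95% confidence interval, but this record does not say how the interval was obtained. As a minimal sketch, assuming a percentile bootstrap over the test set (an assumption, not the authors' documented method), the calculation could look like the following. The function names `roc_auc` and `bootstrap_auc_ci` and the toy labels and scores are hypothetical and are not taken from the study.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate plus a percentile-bootstrap interval (assumed method)."""
    point = roc_auc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return point, lower, upper


# Toy data for illustration only (not from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
print(bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200))
```

With a test set of 116 participants, as in the description, an interval of this kind is expected to be fairly wide, which is consistent with the reported 0.623–0.824 range.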