An explainable machine learning model for prediction of high-risk nonalcoholic steatohepatitis
Abstract Early identification of high-risk metabolic dysfunction-associated steatohepatitis (MASH) can offer patients access to novel therapeutic options and potentially decrease the risk of progression to cirrhosis. This study aimed to develop an explainable machine learning model for high-risk MAS...
Main Authors: | Basile Njei, Eri Osta, Nelvis Njei, Yazan A. Al-Ajlouni, Joseph K. Lim |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio 2024-04-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-024-59183-4 |
_version_ | 1797209445657214976 |
---|---|
author | Basile Njei Eri Osta Nelvis Njei Yazan A. Al-Ajlouni Joseph K. Lim |
author_sort | Basile Njei |
collection | DOAJ |
description | Abstract Early identification of high-risk metabolic dysfunction-associated steatohepatitis (MASH) can offer patients access to novel therapeutic options and potentially decrease the risk of progression to cirrhosis. This study aimed to develop an explainable machine learning model for high-risk MASH prediction and compare its performance with well-established biomarkers. Data were derived from the National Health and Nutrition Examination Surveys (NHANES) 2017-March 2020, which included a total of 5281 adults with valid elastography measurements. We used a FAST score ≥ 0.35, calculated using liver stiffness measurement and controlled attenuation parameter values and aspartate aminotransferase levels, to identify individuals with high-risk MASH. We developed an ensemble-based machine learning XGBoost model to detect high-risk MASH and explored the model’s interpretability using an explainable artificial intelligence SHAP method. The prevalence of high-risk MASH was 6.9%. Our XGBoost model achieved a high level of sensitivity (0.82), specificity (0.91), accuracy (0.90), and AUC (0.95) for identifying high-risk MASH. Our model demonstrated a superior ability to predict high-risk MASH vs. FIB-4, APRI, BARD, and MASLD fibrosis scores (AUC of 0.95 vs. 0.50, 0.50, 0.49 and 0.50, respectively). To explain the high performance of our model, we found that the top 5 predictors of high-risk MASH were ALT, GGT, platelet count, waist circumference, and age. We used an explainable ML approach to develop a clinically applicable model that outperforms commonly used clinical risk indices and could increase the identification of high-risk MASH patients in resource-limited settings. |
first_indexed | 2024-04-24T09:54:49Z |
format | Article |
id | doaj.art-e1e312dd308649e99ddfa84b73b8c4b1 |
institution | Directory Open Access Journal |
issn | 2045-2322 |
language | English |
last_indexed | 2024-04-24T09:54:49Z |
publishDate | 2024-04-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-e1e312dd308649e99ddfa84b73b8c4b1; 2024-04-14T11:13:33Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2024-04-01; 14119; 10.1038/s41598-024-59183-4; An explainable machine learning model for prediction of high-risk nonalcoholic steatohepatitis; Basile Njei (Section of Digestive Diseases, Yale School of Medicine); Eri Osta (University of Texas Health San Antonio); Nelvis Njei (Centers for Medicare and Medicaid Services); Yazan A. Al-Ajlouni (School of Medicine, New York Medical College); Joseph K. Lim (Section of Digestive Diseases, Yale School of Medicine); https://doi.org/10.1038/s41598-024-59183-4 |
title | An explainable machine learning model for prediction of high-risk nonalcoholic steatohepatitis |
url | https://doi.org/10.1038/s41598-024-59183-4 |
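
The performance figures quoted in the description can be cross-checked against each other. The sketch below uses only the reported sensitivity (0.82), specificity (0.91), and prevalence (6.9%). It assumes, as a simplification not stated in the record, that the evaluation set has the same prevalence as the full sample. The function names are illustrative and do not come from the article.

```python
"""Consistency check on the reported metrics (illustrative sketch)."""


def implied_accuracy(sensitivity: float, specificity: float, prevalence: float) -> float:
    # Accuracy is the prevalence-weighted mean of sensitivity and specificity:
    # correct positives plus correct negatives, as fractions of all subjects.
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


def implied_ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    # Positive predictive value via Bayes' rule.
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Values quoted in the description field of this record.
SENSITIVITY = 0.82
SPECIFICITY = 0.91
PREVALENCE = 0.069  # assumed to hold in the evaluation set as well

if __name__ == "__main__":
    acc = implied_accuracy(SENSITIVITY, SPECIFICITY, PREVALENCE)
    ppv = implied_ppv(SENSITIVITY, SPECIFICITY, PREVALENCE)
    print(f"implied accuracy: {acc:.3f}")
    print(f"implied PPV:      {ppv:.3f}")
```

Under that assumption the implied accuracy rounds to 0.90, which agrees with the reported value. The implied positive predictive value, roughly 0.40, is a derived figure rather than one reported in the record, and it depends directly on the prevalence assumption.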