Machine learning models to predict disease progression among veterans with hepatitis C virus.
<h4>Background</h4>Machine learning (ML) algorithms provide effective ways to build prediction models using longitudinal information given their capacity to incorporate numerous predictor variables without compromising the accuracy of the risk prediction. Clinical risk prediction models...
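Main Authors: | Monica A Konerman, Lauren A Beste, Tony Van, Boang Liu, Xuefei Zhang, Ji Zhu, Sameer D Saini, Grace L Su, Brahmajee K Nallamothu, George N Ioannou, Akbar K Waljee |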
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2019-01-01 |
Series: | PLoS ONE |
Online Access: | https://doi.org/10.1371/journal.pone.0208141 |
_version_ | 1798024961446641664 |
---|---|
author | Monica A Konerman Lauren A Beste Tony Van Boang Liu Xuefei Zhang Ji Zhu Sameer D Saini Grace L Su Brahmajee K Nallamothu George N Ioannou Akbar K Waljee |
author_sort | Monica A Konerman |
collection | DOAJ |
description | <h4>Background</h4>Machine learning (ML) algorithms provide effective ways to build prediction models using longitudinal information given their capacity to incorporate numerous predictor variables without compromising the accuracy of the risk prediction. Clinical risk prediction models in chronic hepatitis C virus (CHC) can be challenging due to the non-linear nature of disease progression. We developed and compared two ML algorithms to predict cirrhosis development in a large CHC-infected cohort using longitudinal data.<h4>Methods and findings</h4>We used national Veterans Health Administration (VHA) data to identify CHC patients in care between 2000 and 2016. The primary outcome was cirrhosis development, ascertained by two consecutive aspartate aminotransferase (AST)-to-platelet ratio indexes (APRIs) > 2 after time zero, given the infrequency of liver biopsy in clinical practice and that APRI is a validated non-invasive biomarker of fibrosis in CHC. We excluded those with an initial APRI > 2 or a pre-existing diagnosis of cirrhosis, hepatocellular carcinoma or hepatic decompensation. Enrollment was defined as the date of the first APRI. Time zero was defined as 2 years after enrollment. Cross-sectional (CS) models, which served as the comparison, used predictors at or closest before time zero. Longitudinal models used CS predictors plus longitudinal summary variables (maximum, minimum, maximum of slope, minimum of slope and total variation) between enrollment and time zero. Covariates included demographics, labs, and body mass index. Model performance was evaluated using concordance and area under the receiver operating characteristic curve (AuROC). A total of 72,683 individuals with CHC were analyzed; the cohort had a mean age of 52.8 years and was 96.8% male and 53% white. Overall, 11,616 individuals (16%) met the primary outcome over a mean follow-up of 7 years. We found superior predictive performance for the longitudinal Cox model compared to the CS Cox model (concordance 0.764 vs 0.746), and for the longitudinal boosted-survival-tree model compared to the linear Cox model (concordance 0.774 vs 0.764). The accuracy of the longitudinal models at 1, 3, and 5 years after time zero also showed superior performance compared to the CS model, based on AuROC.<h4>Conclusions</h4>Boosted-survival-tree-based models using longitudinal information are statistically superior to cross-sectional or linear models for predicting development of cirrhosis in CHC, though all four models were highly accurate. Similar statistical methods could be applied to predict outcomes in other non-linear chronic disease states. |
first_indexed | 2024-04-11T18:12:19Z |
format | Article |
id | doaj.art-b76fb559e94243198753ba5237f302c7 |
institution | Directory Open Access Journal |
issn | 1932-6203 |
language | English |
last_indexed | 2024-04-11T18:12:19Z |
publishDate | 2019-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
title | Machine learning models to predict disease progression among veterans with hepatitis C virus. |
url | https://doi.org/10.1371/journal.pone.0208141 |
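
The abstract in this record names an APRI-based outcome and five longitudinal summary variables without defining them. The following is a minimal Python sketch of how such quantities might be computed; it is not the authors' code. The APRI formula is the standard one, but the AST upper limit of normal of 40 IU/L, the use of slopes between consecutive measurements, the definition of total variation as the sum of absolute consecutive differences, and all function names are assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence, Tuple


def apri(ast_iu_l: float, platelets_10e9_l: float, ast_uln: float = 40.0) -> float:
    """APRI = (AST / upper limit of normal) / platelets (10^9/L) * 100.

    ast_uln = 40 IU/L is an assumed default; laboratories differ.
    """
    return (ast_iu_l / ast_uln) / platelets_10e9_l * 100.0


def first_outcome_index(apris: Sequence[float], threshold: float = 2.0) -> Optional[int]:
    """Return the index of the second of the first two consecutive APRIs
    above the threshold, or None if the outcome is never met."""
    for i in range(1, len(apris)):
        if apris[i - 1] > threshold and apris[i] > threshold:
            return i
    return None


def longitudinal_summary(series: Sequence[Tuple[float, float]]) -> Dict[str, float]:
    """Summarize (time, value) pairs sorted by strictly increasing time.

    Assumed definitions: slopes are taken between consecutive measurements,
    and total variation is the sum of absolute consecutive differences.
    """
    if len(series) < 2:
        raise ValueError("need at least two measurements")
    times = [t for t, _ in series]
    values = [v for _, v in series]
    diffs = [values[i] - values[i - 1] for i in range(1, len(values))]
    slopes = [d / (times[i] - times[i - 1]) for i, d in enumerate(diffs, start=1)]
    return {
        "max": max(values),
        "min": min(values),
        "max_slope": max(slopes),
        "min_slope": min(slopes),
        "total_variation": sum(abs(d) for d in diffs),
    }


# Hypothetical AST values (IU/L) at years 0, 1 and 2 after enrollment.
ast_series = [(0.0, 50.0), (1.0, 70.0), (2.0, 60.0)]
features = longitudinal_summary(ast_series)
```

In this sketch, a cross-sectional model would see only the last AST value before time zero, whereas a longitudinal model would additionally receive the entries of `features` for each laboratory variable.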