Machine learning models to predict disease progression among veterans with hepatitis C virus.

<h4>Background</h4>Machine learning (ML) algorithms provide effective ways to build prediction models using longitudinal information given their capacity to incorporate numerous predictor variables without compromising the accuracy of the risk prediction. Clinical risk prediction models...

Bibliographic Details
Main Authors:	Monica A Konerman, Lauren A Beste, Tony Van, Boang Liu, Xuefei Zhang, Ji Zhu, Sameer D Saini, Grace L Su, Brahmajee K Nallamothu, George N Ioannou, Akbar K Waljee
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2019-01-01
Series:	PLoS ONE
Online Access:	https://doi.org/10.1371/journal.pone.0208141

author	Monica A Konerman Lauren A Beste Tony Van Boang Liu Xuefei Zhang Ji Zhu Sameer D Saini Grace L Su Brahmajee K Nallamothu George N Ioannou Akbar K Waljee
collection	DOAJ
description	<h4>Background</h4>Machine learning (ML) algorithms provide effective ways to build prediction models using longitudinal information given their capacity to incorporate numerous predictor variables without compromising the accuracy of the risk prediction. Clinical risk prediction in chronic hepatitis C virus (CHC) infection can be challenging due to the non-linear nature of disease progression. We developed and compared two ML algorithms to predict cirrhosis development in a large CHC-infected cohort using longitudinal data.<h4>Methods and findings</h4>We used national Veterans Health Administration (VHA) data to identify CHC patients in care between 2000 and 2016. The primary outcome was cirrhosis development, ascertained by two consecutive aspartate aminotransferase (AST)-to-platelet ratio indexes (APRIs) > 2 after time zero, given the infrequency of liver biopsy in clinical practice and that APRI is a validated non-invasive biomarker of fibrosis in CHC. We excluded those with an initial APRI > 2 or a pre-existing diagnosis of cirrhosis, hepatocellular carcinoma or hepatic decompensation. Enrollment was defined as the date of the first APRI. Time zero was defined as 2 years after enrollment. Cross-sectional (CS) models, used as a comparison, took predictors at or closest before time zero. Longitudinal models used the CS predictors plus longitudinal summary variables (maximum, minimum, maximum of slope, minimum of slope and total variation) between enrollment and time zero. Covariates included demographics, labs, and body mass index. Model performance was evaluated using concordance and area under the receiver operating characteristic curve (AuROC). A total of 72,683 individuals with CHC were analyzed; the cohort had a mean age of 52.8 years and was 96.8% male and 53% white. In total, 11,616 individuals (16%) met the primary outcome over a mean follow-up of 7 years. We found superior predictive performance for the longitudinal Cox model compared to the CS Cox model (concordance 0.764 vs 0.746), and for the longitudinal boosted-survival-tree model compared to the linear Cox model (concordance 0.774 vs 0.764). The accuracy of the longitudinal models at 1, 3, and 5 years after time zero also showed superior performance compared to the CS model, based on AuROC.<h4>Conclusions</h4>Boosted-survival-tree based models using longitudinal information are statistically superior to cross-sectional or linear models for predicting development of cirrhosis in CHC, though all four models were highly accurate. Similar statistical methods could be applied to predict outcomes in other non-linear chronic disease states.
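As a rough illustration of the quantities named in the abstract, the sketch below computes APRI, looks for two consecutive APRIs > 2, and derives the five longitudinal summary variables for a single lab series. It is not the authors' code. The function names, the use of point-to-point slopes, the definition of total variation as the sum of absolute successive differences, and the choice to report the first measurement of the qualifying pair are all assumptions made for this example; the abstract does not specify them.

```python
from typing import Dict, Optional, Sequence


def apri(ast_iu_l: float, ast_uln_iu_l: float, platelets_1e9_l: float) -> float:
    """AST-to-platelet ratio index: (AST / upper limit of normal) / platelets * 100."""
    return (ast_iu_l / ast_uln_iu_l) / platelets_1e9_l * 100.0


def first_sustained_high(values: Sequence[float], threshold: float = 2.0) -> Optional[int]:
    """Return the index of the first value in the first consecutive pair above threshold.

    Assumption: the event is placed at the first measurement of the pair.
    Returns None when no two consecutive values exceed the threshold.
    """
    for i in range(len(values) - 1):
        if values[i] > threshold and values[i + 1] > threshold:
            return i
    return None


def longitudinal_summary(times: Sequence[float], values: Sequence[float]) -> Dict[str, float]:
    """Summarize one lab series measured between enrollment and time zero.

    Assumptions: times are strictly increasing, slopes are taken between
    consecutive measurements, and total variation is the sum of absolute
    successive differences.
    """
    if len(times) != len(values) or len(values) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [values[i + 1] - values[i] for i in range(len(values) - 1)]
    gaps = [times[i + 1] - times[i] for i in range(len(times) - 1)]
    if any(g <= 0 for g in gaps):
        raise ValueError("times must be strictly increasing")
    slopes = [d / g for d, g in zip(diffs, gaps)]
    return {
        "max": max(values),
        "min": min(values),
        "max_slope": max(slopes),
        "min_slope": min(slopes),
        "total_variation": sum(abs(d) for d in diffs),
    }


if __name__ == "__main__":
    # Hypothetical measurements (years since enrollment, APRI values).
    print(longitudinal_summary([0.0, 0.5, 1.0, 2.0], [1.0, 1.5, 1.2, 2.2]))
    print(first_sustained_high([1.0, 2.5, 1.8, 2.1, 2.4]))
```

In a full pipeline, summaries like these would be computed per patient and per lab and appended to the cross-sectional predictors before fitting a survival model.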
first_indexed	2024-04-11T18:12:19Z
format	Article
id	doaj.art-b76fb559e94243198753ba5237f302c7
institution	Directory Open Access Journal
issn	1932-6203
language	English
last_indexed	2024-04-11T18:12:19Z
publishDate	2019-01-01
publisher	Public Library of Science (PLoS)
record_format	Article
series	PLoS ONE
title	Machine learning models to predict disease progression among veterans with hepatitis C virus.
url	https://doi.org/10.1371/journal.pone.0208141

Similar Items