A Bayesian approach to predictive uncertainty in chemotherapy patients at risk of acute care utilization
Summary: Background: Machine learning (ML) predictions are becoming increasingly integrated into medical practice. One commonly used method, ℓ1-penalised logistic regression (LASSO), can estimate patient risk for disease outcomes but is limited by only providing point estimates. Instead, Bayesian l...
Main Authors: | Claudio Fanconi, Anne de Hond, Dylan Peterson, Angelo Capodici, Tina Hernandez-Boussard |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2023-06-01 |
Series: | EBioMedicine |
Subjects: | Bayesian logistic LASSO regression; Predictive uncertainty; Acute care utilization; Chemotherapy |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2352396423001974 |
_version_ | 1797812841800007680 |
---|---|
author | Claudio Fanconi; Anne de Hond; Dylan Peterson; Angelo Capodici; Tina Hernandez-Boussard |
author_facet | Claudio Fanconi; Anne de Hond; Dylan Peterson; Angelo Capodici; Tina Hernandez-Boussard |
author_sort | Claudio Fanconi |
collection | DOAJ |
description | Summary: Background: Machine learning (ML) predictions are becoming increasingly integrated into medical practice. One commonly used method, ℓ1-penalised logistic regression (LASSO), can estimate patient risk for disease outcomes but is limited by only providing point estimates. Instead, Bayesian logistic LASSO regression (BLLR) models provide distributions for risk predictions, giving clinicians a better understanding of predictive uncertainty, but they are not commonly implemented. Methods: This study evaluates the predictive performance of different BLLRs compared to standard logistic LASSO regression, using real-world, high-dimensional, structured electronic health record (EHR) data from cancer patients initiating chemotherapy at a comprehensive cancer centre. Multiple BLLR models were compared against a LASSO model, with an 80–20 random split and 10-fold cross-validation, to predict the risk of acute care utilization (ACU) after starting chemotherapy. Findings: This study included 8439 patients. The LASSO model predicted ACU with an area under the receiver operating characteristic curve (AUROC) of 0.806 (95% CI: 0.775–0.834). BLLR with a Horseshoe+ prior and a posterior approximated by Metropolis–Hastings sampling showed similar performance, with an AUROC of 0.807 (95% CI: 0.780–0.834), and offers the advantage of uncertainty estimation for each prediction. In addition, BLLR could identify predictions too uncertain to be automatically classified. BLLR uncertainties were stratified by different patient subgroups, demonstrating that predictive uncertainties significantly differ across race, cancer type, and stage. Interpretation: BLLRs are a promising yet underutilised tool that increases explainability by providing risk estimates while offering a similar level of performance to standard LASSO-based models. Additionally, these models can identify patient subgroups with higher uncertainty, which can augment clinical decision-making. Funding: This work was supported in part by the National Library of Medicine of the National Institutes of Health under Award Number R01LM013362. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. |
first_indexed | 2024-03-13T07:43:21Z |
format | Article |
id | doaj.art-f87a79b718e04855b1ea287d10e290fa |
institution | Directory Open Access Journal |
issn | 2352-3964 |
language | English |
last_indexed | 2024-03-13T07:43:21Z |
publishDate | 2023-06-01 |
publisher | Elsevier |
record_format | Article |
series | EBioMedicine |
spelling | doaj.art-f87a79b718e04855b1ea287d10e290fa 2023-06-03T04:22:17Z eng Elsevier EBioMedicine 2352-3964 2023-06-01 92 104632 A Bayesian approach to predictive uncertainty in chemotherapy patients at risk of acute care utilization. Authors: Claudio Fanconi (0), Anne de Hond (1), Dylan Peterson (2), Angelo Capodici (3), Tina Hernandez-Boussard (4). Affiliations: (0) Department of Information Technology and Electrical Engineering, ETH Zürich, Zürich, Switzerland; Department of Medicine (Biomedical Informatics), Stanford University, Stanford, USA. (1) Department of Medicine (Biomedical Informatics), Stanford University, Stanford, USA; Clinical AI Implementation and Research Lab, Leiden University Medical Centre, Leiden, the Netherlands. (2) Department of Medicine (Biomedical Informatics), Stanford University, Stanford, USA. (3) Department of Medicine (Biomedical Informatics), Stanford University, Stanford, USA; Department of Biomedical and Neuromotor Science, University of Bologna, Bologna, Italy. (4) Department of Medicine (Biomedical Informatics), Stanford University, Stanford, USA; Corresponding author. Stanford University, 453 Quarry Road, Palo Alto, CA, 94304, USA. http://www.sciencedirect.com/science/article/pii/S2352396423001974 Keywords: Bayesian logistic LASSO regression; Predictive uncertainty; Acute care utilization; Chemotherapy |
spellingShingle | Claudio Fanconi; Anne de Hond; Dylan Peterson; Angelo Capodici; Tina Hernandez-Boussard; A Bayesian approach to predictive uncertainty in chemotherapy patients at risk of acute care utilization; EBioMedicine; Bayesian logistic LASSO regression; Predictive uncertainty; Acute care utilization; Chemotherapy |
title | A Bayesian approach to predictive uncertainty in chemotherapy patients at risk of acute care utilization |
title_sort | bayesian approach to predictive uncertainty in chemotherapy patients at risk of acute care utilization |
topic | Bayesian logistic LASSO regression; Predictive uncertainty; Acute care utilization; Chemotherapy |
url | http://www.sciencedirect.com/science/article/pii/S2352396423001974 |