Predicting Depression Risk in Patients With Cancer Using Multimodal Data: Algorithm Development Study

Background: Patients with cancer starting systemic treatment programs, such as chemotherapy, often develop depression. A prediction model may assist physicians and health care workers in the early identification of these vulnerable patients. Objective: This study aime...


Bibliographic Details
Main Authors:	Anne de Hond, Marieke van Buchem, Claudio Fanconi, Mohana Roy, Douglas Blayney, Ilse Kant, Ewout Steyerberg, Tina Hernandez-Boussard
Format:	Article
Language:	English
Published:	JMIR Publications 2024-01-01
Series:	JMIR Medical Informatics
Online Access:	https://medinform.jmir.org/2024/1/e51925

_version_	1827379258338574336
author	Anne de Hond, Marieke van Buchem, Claudio Fanconi, Mohana Roy, Douglas Blayney, Ilse Kant, Ewout Steyerberg, Tina Hernandez-Boussard
author_sort	Anne de Hond
collection	DOAJ
description	Background: Patients with cancer starting systemic treatment programs, such as chemotherapy, often develop depression. A prediction model may assist physicians and health care workers in the early identification of these vulnerable patients. Objective: This study aimed to develop a prediction model for depression risk within the first month of cancer treatment. Methods: We included 16,159 patients diagnosed with cancer starting chemo- or radiotherapy treatment between 2008 and 2021. Machine learning models (eg, least absolute shrinkage and selection operator [LASSO] logistic regression) and natural language processing models (Bidirectional Encoder Representations from Transformers [BERT]) were used to develop multimodal prediction models using both electronic health record data and unstructured text (patient emails and clinician notes). Model performance was assessed in an independent test set (n=5387, 33%) using area under the receiver operating characteristic curve (AUROC), calibration curves, and decision curve analysis to assess the initial clinical impact. Results: Among 16,159 patients, 437 (2.7%) received a depression diagnosis within the first month of treatment. The LASSO logistic regression models based on the structured data (AUROC 0.74, 95% CI 0.71-0.78) and structured data with email classification scores (AUROC 0.74, 95% CI 0.71-0.78) had the best discriminative performance. The BERT models based on clinician notes and structured data with email classification scores had AUROCs around 0.71. The logistic regression model based on email classification scores alone performed poorly (AUROC 0.54, 95% CI 0.52-0.56), and the model based solely on clinician notes had the worst performance (AUROC 0.50, 95% CI 0.49-0.52). Calibration was good for the logistic regression models, whereas the BERT models produced overly extreme risk estimates even after recalibration. There was a small range of decision thresholds for which the best-performing model showed promising clinical usefulness. The risks were underestimated for female and Black patients. Conclusions: The results demonstrated the potential and limitations of machine learning and multimodal models for predicting depression risk in patients with cancer. Future research is needed to further validate these models, refine the outcome label and predictors related to mental health, and address biases across subgroups.
first_indexed	2024-03-08T13:10:41Z
format	Article
id	doaj.art-c3aabf3b878948349da52c9b3919b68b
institution	Directory of Open Access Journals
issn	2291-9694
language	English
last_indexed	2024-03-08T13:10:41Z
publishDate	2024-01-01
publisher	JMIR Publications
record_format	Article
series	JMIR Medical Informatics
spelling	doaj.art-c3aabf3b878948349da52c9b3919b68b; 2024-01-18T14:45:44Z; eng; JMIR Publications; JMIR Medical Informatics; 2291-9694; 2024-01-01; 12; e51925; 10.2196/51925; Predicting Depression Risk in Patients With Cancer Using Multimodal Data: Algorithm Development Study; Anne de Hond (https://orcid.org/0000-0002-3473-3398); Marieke van Buchem (https://orcid.org/0000-0002-2917-0842); Claudio Fanconi (https://orcid.org/0000-0001-5308-3821); Mohana Roy (https://orcid.org/0000-0002-9997-8935); Douglas Blayney (https://orcid.org/0000-0002-7931-4533); Ilse Kant (https://orcid.org/0000-0002-5273-5178); Ewout Steyerberg (https://orcid.org/0000-0002-7787-0122); Tina Hernandez-Boussard (https://orcid.org/0000-0001-6553-3455); https://medinform.jmir.org/2024/1/e51925
title	Predicting Depression Risk in Patients With Cancer Using Multimodal Data: Algorithm Development Study
url	https://medinform.jmir.org/2024/1/e51925
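
The description above mentions decision curve analysis and a small range of useful decision thresholds. As a rough illustration only, the sketch below computes the standard net benefit quantity used in decision curve analysis, TP/n - FP/n * pt/(1 - pt), for a model and for a "treat all" strategy. The function names and the toy data are hypothetical and are not taken from the study; this is not the authors' code.

```python
# Minimal sketch of net benefit as used in decision curve analysis.
# Function names and the toy data below are hypothetical illustrations,
# not the study's implementation or data.

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening when predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True positives: flagged patients who had the outcome.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: flagged patients who did not have the outcome.
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of intervening for every patient regardless of risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Hypothetical toy cohort: 1 event among 10 patients.
    y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    y_prob = [0.30, 0.20, 0.05, 0.04, 0.03, 0.03, 0.02, 0.02, 0.01, 0.01]
    for t in (0.02, 0.05, 0.10, 0.25):
        print(t, net_benefit(y_true, y_prob, t), net_benefit_treat_all(y_true, t))
```

A model is usually considered potentially useful at a threshold only when its net benefit exceeds both the "treat all" value and zero ("treat none"). With a rare outcome, "treat all" stops being beneficial once the threshold passes the prevalence, which is one reason the useful threshold range can be narrow.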


Similar Items