Predicting early psychiatric readmission with natural language processing of narrative discharge summaries

Bibliographic Details
Main Authors: Castro, V M, McCoy, T H, Perlis, R H, Naumann, Tristan, Szolovits, Peter, Rumshisky, Anna A., Ghassemi, Marzyeh
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: en_US
Published: Nature Publishing Group 2017
Online Access: http://hdl.handle.net/1721.1/108225
https://orcid.org/0000-0003-2150-1747
https://orcid.org/0000-0001-8411-6403
https://orcid.org/0000-0002-8029-0823
https://orcid.org/0000-0001-6349-7251
_version_ 1826197944666161152
author Castro, V M
McCoy, T H
Perlis, R H
Naumann, Tristan
Szolovits, Peter
Rumshisky, Anna A.
Ghassemi, Marzyeh
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
collection MIT
description The ability to predict psychiatric readmission would facilitate the development of interventions to reduce this risk, a major driver of psychiatric health-care costs. The symptoms or characteristics of illness course necessary to develop reliable predictors are not available in coded billing data, but may be present in narrative electronic health record (EHR) discharge summaries. We identified a cohort of individuals admitted to a psychiatric inpatient unit between 1994 and 2012 with a principal diagnosis of major depressive disorder, and extracted inpatient psychiatric discharge narrative notes. Using these data, we trained a 75-topic Latent Dirichlet Allocation (LDA) model, a form of natural language processing, which identifies groups of words associated with topics discussed in a document collection. The cohort was randomly split to derive a training (70%) and testing (30%) data set, and we trained separate support vector machine models for baseline clinical features alone, baseline features plus common individual words and the above plus topics identified from the 75-topic LDA model. Of 4687 patients with inpatient discharge summaries, 470 were readmitted within 30 days. The 75-topic LDA model included topics linked to psychiatric symptoms (suicide, severe depression, anxiety, trauma, eating/weight and panic) and major depressive disorder comorbidities (infection, postpartum, brain tumor, diarrhea and pulmonary disease). By including LDA topics, prediction of readmission, as measured by area under receiver-operating characteristic curves in the testing data set, was improved from baseline (area under the curve 0.618) to baseline+1000 words (0.682) to baseline+75 topics (0.784). Inclusion of topics derived from narrative notes allows more accurate discrimination of individuals at high risk for psychiatric readmission in this cohort. Topic modeling and related approaches offer the potential to improve prediction using EHRs, if generalizability can be established in other clinical cohorts.
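The evaluation arithmetic in the description can be illustrated with a minimal sketch. It is not the authors' code: the helper names `split_70_30` and `auc_from_scores`, the random seed, and the toy scores are assumptions made for illustration, and neither the LDA model nor the support vector machines are reproduced. It only shows a random 70/30 split of 4687 patient indices, the 30-day readmission rate implied by 470 of 4687, and the area under the ROC curve computed as the fraction of positive/negative pairs that a score ranks correctly.

```python
import random

N_PATIENTS = 4687    # patients with discharge summaries (from the description)
N_READMITTED = 470   # readmitted within 30 days (from the description)


def split_70_30(ids, seed=0):
    """Randomly split ids into 70% training and 30% testing (seed is an assumption)."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc_from_scores(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Readmission base rate implied by the reported counts (roughly 10%).
base_rate = N_READMITTED / N_PATIENTS

# Split hypothetical patient indices 0..4686 into training and testing sets.
train_ids, test_ids = split_70_30(range(N_PATIENTS))

# Toy labels and scores, invented purely to exercise the AUC function.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_from_scores(toy_labels, toy_scores)

if __name__ == "__main__":
    print(f"base rate: {base_rate:.3f}")
    print(f"train/test sizes: {len(train_ids)}/{len(test_ids)}")
    print(f"toy AUC: {toy_auc:.2f}")
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking, so the reported values of 0.618, 0.682 and 0.784 each describe how often the respective model would rank a readmitted patient above a non-readmitted one in the testing data set.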
first_indexed 2024-09-23T10:56:10Z
format Article
id mit-1721.1/108225
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T10:56:10Z
publishDate 2017
publisher Nature Publishing Group
record_format dspace
spelling mit-1721.1/108225 2022-09-27T16:03:49Z
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
2017-04-18T19:08:43Z
2017-04-18T19:08:43Z
2016-10
2015-08
Article
http://purl.org/eprint/type/JournalArticle
2158-3188
http://hdl.handle.net/1721.1/108225
Rumshisky, A et al. “Predicting Early Psychiatric Readmission with Natural Language Processing of Narrative Discharge Summaries.” Translational Psychiatry 6.10 (2016): e921.
https://orcid.org/0000-0003-2150-1747
https://orcid.org/0000-0001-8411-6403
https://orcid.org/0000-0002-8029-0823
https://orcid.org/0000-0001-6349-7251
en_US
http://dx.doi.org/10.1038/tp.2015.182
Translational Psychiatry
Creative Commons Attribution-NonCommercial-NoDerivs License
http://creativecommons.org/licenses/by-nc-nd/4.0/
application/pdf
Nature Publishing Group
Nature
title Predicting early psychiatric readmission with natural language processing of narrative discharge summaries
url http://hdl.handle.net/1721.1/108225
https://orcid.org/0000-0003-2150-1747
https://orcid.org/0000-0001-8411-6403
https://orcid.org/0000-0002-8029-0823
https://orcid.org/0000-0001-6349-7251