Analysis of risk factor domains in psychosis patient health records

Abstract Background Readmission after discharge from a hospital is disruptive and costly, regardless of the reason. However, it can be particularly problematic for psychiatric patients, so predicting which patients may be readmitted is critically important but also very difficult. Clinical narrative...

Bibliographic Details
Main Authors: Eben Holderness, Nicholas Miller, Philip Cawkwell, Kirsten Bolton, Marie Meteer, James Pustejovsky, Mei-Hua Hall
Format: Article
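Language: English
Published: BMC 2019-10-01
Series: Journal of Biomedical Semantics
Subjects: Natural language processing; Risk prediction; Machine learning; Electronic health record; Psychotic disorders
Online Access: http://link.springer.com/article/10.1186/s13326-019-0210-8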
author Eben Holderness
Nicholas Miller
Philip Cawkwell
Kirsten Bolton
Marie Meteer
James Pustejovsky
Mei-Hua Hall
collection DOAJ
description Abstract
Background: Readmission after discharge from a hospital is disruptive and costly, regardless of the reason. However, it can be particularly problematic for psychiatric patients, so predicting which patients may be readmitted is critically important but also very difficult. Clinical narratives in psychiatric electronic health records (EHRs) span a wide range of topics and vocabulary; therefore, a psychiatric readmission prediction model must begin with a robust and interpretable topic extraction component.
Results: We designed and evaluated multiple multilayer perceptron and radial basis function neural networks to predict the sentences in a patient’s EHR that are associated with one or more of seven readmission risk factor domains that we identified. In contrast to our baseline cosine similarity model that is based on the methodologies of prior works, our deep learning approaches achieved considerably better F1 scores (0.83 vs 0.66) while also being more scalable and computationally efficient with large volumes of data. Additionally, we found that integrating clinically relevant multiword expressions during preprocessing improves the accuracy of our models and allows for identifying a wider scope of training data in a semi-supervised setting.
Conclusion: We created a data pipeline for using document vector similarity metrics to perform topic extraction on psychiatric EHR data in service of our long-term goal of creating a readmission risk classifier. We show results for our topic extraction model and identify additional features we will be incorporating in the future.
first_indexed 2024-12-23T10:08:52Z
format Article
id doaj.art-21cc23baec5a447ea9d864d8e56c4a7b
institution Directory of Open Access Journals
issn 2041-1480
language English
last_indexed 2024-12-23T10:08:52Z
publishDate 2019-10-01
publisher BMC
record_format Article
series Journal of Biomedical Semantics
spelling doaj.art-21cc23baec5a447ea9d864d8e56c4a7b
2022-12-21T17:51:00Z
eng
BMC
Journal of Biomedical Semantics
2041-1480
2019-10-01
Volume 10, issue 1, pages 1-10
10.1186/s13326-019-0210-8
Analysis of risk factor domains in psychosis patient health records
Eben Holderness (Psychosis Neurobiology Laboratory, McLean Hospital, Harvard Medical School)
Nicholas Miller (Psychosis Neurobiology Laboratory, McLean Hospital, Harvard Medical School)
Philip Cawkwell (Psychosis Neurobiology Laboratory, McLean Hospital, Harvard Medical School)
Kirsten Bolton (Psychosis Neurobiology Laboratory, McLean Hospital, Harvard Medical School)
Marie Meteer (Brandeis University Department of Computer Science)
James Pustejovsky (Brandeis University Department of Computer Science)
Mei-Hua Hall (Psychosis Neurobiology Laboratory, McLean Hospital, Harvard Medical School)
title Analysis of risk factor domains in psychosis patient health records
topic Natural language processing
Risk prediction
Machine learning
Electronic health record
Psychotic disorders
url http://link.springer.com/article/10.1186/s13326-019-0210-8