Psychosis Relapse Prediction Leveraging Electronic Health Records Data and Natural Language Processing Enrichment Methods

Background: Identifying patients at a high risk of psychosis relapse is crucial for early interventions. A relevant psychiatric clinical context is often recorded in clinical notes; however, the utilization of unstructured data remains limited. This study aimed to develop psychosis-relapse prediction...


Bibliographic Details
Main Authors: Dong Yun Lee, Chungsoo Kim, Seongwon Lee, Sang Joon Son, Sun-Mi Cho, Yong Hyuk Cho, Jaegyun Lim, Rae Woong Park
Format: Article
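Language: English
Published: Frontiers Media S.A. 2022-04-01
Series: Frontiers in Psychiatry
Subjects: natural language processing; psychotic disorder; recurrence; models, statistical; electronic health records
Online Access: https://www.frontiersin.org/articles/10.3389/fpsyt.2022.844442/full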
_version_ 1818670355321978880
author Dong Yun Lee
Chungsoo Kim
Seongwon Lee
Sang Joon Son
Sun-Mi Cho
Yong Hyuk Cho
Jaegyun Lim
Rae Woong Park
author_sort Dong Yun Lee
collection DOAJ
description Background: Identifying patients at a high risk of psychosis relapse is crucial for early interventions. A relevant psychiatric clinical context is often recorded in clinical notes; however, the utilization of unstructured data remains limited. This study aimed to develop psychosis-relapse prediction models using various types of clinical notes and structured data.
Methods: Clinical data were extracted from the electronic health records of the Ajou University Medical Center in South Korea. The study population included patients with psychotic disorders, and the outcome was psychosis relapse within 1 year. Using only structured data, we developed an initial prediction model, then three natural language processing (NLP)-enriched models using three types of clinical notes (psychological tests, admission notes, and initial nursing assessment) and one complete model. Latent Dirichlet Allocation was used to cluster the clinical context into similar topics. All models applied the least absolute shrinkage and selection operator logistic regression algorithm. We also performed an external validation using another hospital database.
Results: A total of 330 patients were included, and 62 (18.8%) experienced psychosis relapse. Six predictors were used in the initial model, and 10 additional topics from Latent Dirichlet Allocation processing were added in the enriched models. The model derived from all notes showed the highest area under the receiver operating characteristic curve (AUROC = 0.946) in the internal validation, followed by models based on the psychological test notes, admission notes, initial nursing assessments, and structured data only (0.902, 0.855, 0.798, and 0.784, respectively). The external validation was performed using only the initial nursing assessment note, and the AUROC was 0.616.
Conclusions: We developed prediction models for psychosis relapse using the NLP-enrichment method. Models using clinical notes were more effective than models using only structured data, suggesting the importance of unstructured data in psychosis prediction.
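The enrichment approach summarized in the description (topic proportions from Latent Dirichlet Allocation appended to structured predictors, then an L1-penalized logistic regression scored by AUROC) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: it assumes scikit-learn, uses synthetic toy notes and random structured features, and the function names (`topic_features`, `run_demo`), vocabulary, split ratio, and hyperparameters are all hypothetical. Only the counts of 6 structured predictors and 10 topics are taken from the abstract.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def topic_features(train_notes, test_notes, n_topics=10, seed=0):
    """Fit bag-of-words + LDA on training notes; return topic proportions."""
    vectorizer = CountVectorizer()
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    train_topics = lda.fit_transform(vectorizer.fit_transform(train_notes))
    # Test notes are only transformed, so no test information leaks into LDA.
    test_topics = lda.transform(vectorizer.transform(test_notes))
    return train_topics, test_topics


def run_demo(n_patients=200, n_structured=6, n_topics=10, seed=0):
    """Train an L1 (LASSO) logistic model on structured + topic features."""
    rng = np.random.default_rng(seed)
    # Synthetic data (assumption): labels, toy notes, random structured values.
    y = rng.integers(0, 2, size=n_patients)
    relapse_words = ["insomnia", "agitation", "nonadherence", "hallucination"]
    stable_words = ["stable", "cooperative", "adherent", "calm"]
    notes = [
        " ".join(rng.choice(relapse_words if label else stable_words, size=20))
        for label in y
    ]
    structured = rng.normal(size=(n_patients, n_structured))

    idx_train, idx_test = train_test_split(
        np.arange(n_patients), test_size=0.25, stratify=y, random_state=seed
    )
    train_topics, test_topics = topic_features(
        [notes[i] for i in idx_train], [notes[i] for i in idx_test],
        n_topics=n_topics, seed=seed,
    )
    # Enriched design matrix: structured predictors followed by topic columns.
    X_train = np.hstack([structured[idx_train], train_topics])
    X_test = np.hstack([structured[idx_test], test_topics])

    model = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    model.fit(X_train, y[idx_train])
    auroc = roc_auc_score(y[idx_test], model.predict_proba(X_test)[:, 1])
    return X_train.shape[1], auroc


if __name__ == "__main__":
    n_features, auroc = run_demo()
    print(f"features: {n_features}, AUROC on toy data: {auroc:.3f}")
```

Fitting the vectorizer and topic model on the training split alone is a design choice of this sketch; the abstract does not say how the authors handled that step. The AUROC printed here reflects only the toy data and says nothing about the published results.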
first_indexed 2024-12-17T07:06:47Z
format Article
id doaj.art-66013f0b20ab48ee8496efcc27019ac4
institution Directory of Open Access Journals
issn 1664-0640
language English
last_indexed 2024-12-17T07:06:47Z
publishDate 2022-04-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Psychiatry
spelling doaj.art-66013f0b20ab48ee8496efcc27019ac4
2022-12-21T21:59:08Z
eng
Frontiers Media S.A.
Frontiers in Psychiatry
1664-0640
2022-04-01
13
10.3389/fpsyt.2022.844442
844442
Dong Yun Lee (Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea)
Chungsoo Kim (Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, South Korea)
Seongwon Lee (Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea; Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, South Korea)
Sang Joon Son (Department of Psychiatry, Ajou University School of Medicine, Suwon, South Korea)
Sun-Mi Cho (Department of Psychiatry, Ajou University School of Medicine, Suwon, South Korea)
Yong Hyuk Cho (Department of Psychiatry, Ajou University School of Medicine, Suwon, South Korea)
Jaegyun Lim (Department of Laboratory Medicine, Myongji Hospital, Hanyang University College of Medicine, Goyang, South Korea)
Rae Woong Park (Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, South Korea; Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, South Korea)
title Psychosis Relapse Prediction Leveraging Electronic Health Records Data and Natural Language Processing Enrichment Methods
topic natural language processing
psychotic disorder
recurrence
models, statistical
electronic health records
url https://www.frontiersin.org/articles/10.3389/fpsyt.2022.844442/full