Natural language processing for structuring clinical text data on depression using UK-CRIS


Bibliographic Details
Main Authors: Vaci, N, Liu, Q, Kormilitzin, A, De Crescenzo, F, Kurtulmus, A, Harvey, J, O'Dell, B, Innocent, S, Tomlinson, A, Cipriani, A, Nevado-Holgado, A
Format: Journal article
Published: BMJ Publishing Group 2020
collection OXFORD
description <p><strong>Background</strong> Utilisation of routinely collected electronic health records from secondary care offers unprecedented possibilities for medical science research but can also present difficulties. One key issue is that medical information is recorded as free-form text, so clinicians must spend time manually extracting salient information. Natural language processing (NLP) methods can be used to automatically extract clinically relevant information.</p> <p><strong>Objective</strong> Our aim is to use NLP to capture real-world data on individuals with depression from the Clinical Record Interactive Search (CRIS) clinical text, to foster the use of electronic healthcare data in mental health research.</p> <p><strong>Methods</strong> We used a combination of methods to extract salient information from electronic health records. First, clinical experts defined the information of interest and subsequently built the training and testing corpora for statistical models. Second, we built and fine-tuned the statistical models using active learning procedures.</p> <p><strong>Findings</strong> Results show a high degree of accuracy in the extraction of drug-related information. In contrast, accuracy is much lower for auxiliary variables. In combination with state-of-the-art active learning paradigms, the performance of the model increases considerably.</p> <p><strong>Conclusions</strong> This study illustrates the feasibility of using natural language processing models and proposes a research pipeline for accurately extracting information from electronic health records.</p> <p><strong>Clinical implications</strong> Real-world, individual patient data are an invaluable source of information, which can be used to better personalise treatment.</p>
first_indexed 2024-03-07T05:45:38Z
format Journal article
id oxford-uuid:e7200edd-6a6f-4556-bf57-4868c75a4bab
institution University of Oxford
last_indexed 2024-03-07T05:45:38Z
publishDate 2020
publisher BMJ Publishing Group
record_format dspace