Negation recognition in clinical natural language processing using a combination of the NegEx algorithm and a convolutional neural network

Bibliographic Details
Main Authors:	Guillermo Argüello-González, José Aquino-Esperanza, Daniel Salvador, Rosa Bretón-Romero, Carlos Del Río-Bermudez, Jorge Tello, Sebastian Menke
Format:	Article
Language:	English
Published:	BMC 2023-10-01
Series:	BMC Medical Informatics and Decision Making
Subjects:	Negation; NegEx; CNN; Electronic health records; Clinical Natural Language Processing
Online Access:	https://doi.org/10.1186/s12911-023-02301-5

collection	DOAJ
description	Abstract
Background: Important clinical information about patients is contained in the unstructured free-text fields of Electronic Health Records (EHRs). While this information can be extracted using clinical Natural Language Processing (cNLP), recognizing negation modifiers remains an important challenge. A wide range of cNLP applications have been developed to detect the negation of medical entities in clinical free text; however, effective solutions for languages other than English are scarce. This study aimed to develop a solution for negation recognition in Spanish EHRs based on a combination of a customized rule-based NegEx layer and a convolutional neural network (CNN).
Methods: Based on our previous experience in real-world evidence (RWE) studies using information embedded in EHRs, negation recognition was simplified into a binary problem (‘affirmative’ vs. ‘non-affirmative’ class). For the NegEx layer, negation rules were obtained from a publicly available Spanish corpus and enriched with custom ones, while the CNN binary classifier was trained on EHRs annotated for clinical named entities (cNEs) and negation markers by medical doctors.
Results: The proposed negation recognition pipeline obtained precision, recall, and F1-score of 0.93, 0.94, and 0.94 for the ‘affirmative’ class, and 0.86, 0.84, and 0.85 for the ‘non-affirmative’ class, respectively. To validate the generalization capabilities of our methodology, we applied the negation recognition pipeline to EHRs (6,710 cNEs) from a different data source distribution than the training corpus and obtained consistent performance metrics for the ‘affirmative’ and ‘non-affirmative’ class (0.95, 0.97, and 0.96; and 0.90, 0.83, and 0.86 for precision, recall, and F1-score, respectively). Lastly, we evaluated the pipeline against two publicly available Spanish negation corpora, IULA and NUBes, obtaining state-of-the-art metrics (1.00, 0.99, and 0.99; and 1.00, 0.93, and 0.96 for precision, recall, and F1-score, respectively).
Conclusion: Negation recognition is a source of low precision in the retrieval of cNEs from EHR free text. Combining a customized rule-based NegEx layer with a CNN binary classifier outperformed many other current approaches. RWE studies benefit greatly from the correct recognition of negation, as it reduces false-positive detections of cNEs that would otherwise reduce the credibility of cNLP systems.
first_indexed	2024-03-10T17:43:01Z
id	doaj.art-a4996e079f03446fa09bc8cb90a21571
institution	Directory Open Access Journal
issn	1472-6947
last_indexed	2024-03-10T17:43:01Z
spelling	doaj.art-a4996e079f03446fa09bc8cb90a21571; 2023-11-20T09:38:25Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2023-10-01; 23119; 10.1186/s12911-023-02301-5
	Guillermo Argüello-González (MedSavana SL); José Aquino-Esperanza (MedSavana SL); Daniel Salvador (MedSavana SL); Rosa Bretón-Romero (Savana Research); Carlos Del Río-Bermudez (Savana Research); Jorge Tello (MedSavana SL); Sebastian Menke (MedSavana SL)

Similar Items