Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts

Abstract. Background and objective: Although rare diseases are characterized by low prevalence, approximately 400 million people are affected by a rare disease. The early and accurate diagnosis of these conditions is a major challenge for general practitioners, who do not have enough knowledge to iden...


Bibliographic Details
Main Authors: Isabel Segura-Bedmar, David Camino-Perdones, Sara Guerrero-Aspizua
Format: Article
Language: English
Published: BMC 2022-07-01
Series: BMC Bioinformatics
Subjects: Rare diseases; Named entity recognition; Deep learning
Online Access: https://doi.org/10.1186/s12859-022-04810-y
_version_ 1811224087818665984
author Isabel Segura-Bedmar
David Camino-Perdones
Sara Guerrero-Aspizua
collection DOAJ
description Abstract. Background and objective: Although rare diseases are characterized by low prevalence, approximately 400 million people are affected by a rare disease. The early and accurate diagnosis of these conditions is a major challenge for general practitioners, who do not have enough knowledge to identify them. In addition, rare diseases usually show a wide variety of manifestations, which can make diagnosis even more difficult. A delayed diagnosis can negatively affect the patient’s life. Therefore, there is an urgent need to increase the scientific and medical knowledge about rare diseases. Natural Language Processing (NLP) and Deep Learning can help to extract relevant information about rare diseases to facilitate their diagnosis and treatment. Methods: The paper explores several deep learning techniques, such as Bidirectional Long Short-Term Memory (BiLSTM) networks and deep contextualized word representations based on Bidirectional Encoder Representations from Transformers (BERT), to recognize rare diseases and their clinical manifestations (signs and symptoms). Results: BioBERT, a domain-specific language representation based on BERT and trained on biomedical corpora, obtains the best results, with an F1 of 85.2% for rare diseases. Since many signs are usually described by complex noun phrases that involve overlapped, nested and discontinuous entities, the model obtains a lower F1 of 57.2% for signs. Conclusions: While our results are promising, there is still much room for improvement, especially with respect to the identification of clinical manifestations (signs and symptoms).
first_indexed 2024-04-12T08:43:00Z
format Article
id doaj.art-a7b7b5146fc24c1ea630b534ee63ea2f
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-12T08:43:00Z
publishDate 2022-07-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-a7b7b5146fc24c1ea630b534ee63ea2f
2022-12-22T03:39:47Z
eng
BMC
BMC Bioinformatics
1471-2105
2022-07-01
231123
10.1186/s12859-022-04810-y
Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts
Isabel Segura-Bedmar, Human Language and Accessibility Technologies, Computer Science Department, Universidad Carlos III de Madrid
David Camino-Perdones, Human Language and Accessibility Technologies, Computer Science Department, Universidad Carlos III de Madrid
Sara Guerrero-Aspizua, Tissue Engineering and Regenerative Medicine group, Department of Bioengineering, Universidad Carlos III de Madrid
Abstract. Background and objective: Although rare diseases are characterized by low prevalence, approximately 400 million people are affected by a rare disease. The early and accurate diagnosis of these conditions is a major challenge for general practitioners, who do not have enough knowledge to identify them. In addition, rare diseases usually show a wide variety of manifestations, which can make diagnosis even more difficult. A delayed diagnosis can negatively affect the patient’s life. Therefore, there is an urgent need to increase the scientific and medical knowledge about rare diseases. Natural Language Processing (NLP) and Deep Learning can help to extract relevant information about rare diseases to facilitate their diagnosis and treatment. Methods: The paper explores several deep learning techniques, such as Bidirectional Long Short-Term Memory (BiLSTM) networks and deep contextualized word representations based on Bidirectional Encoder Representations from Transformers (BERT), to recognize rare diseases and their clinical manifestations (signs and symptoms). Results: BioBERT, a domain-specific language representation based on BERT and trained on biomedical corpora, obtains the best results, with an F1 of 85.2% for rare diseases. Since many signs are usually described by complex noun phrases that involve overlapped, nested and discontinuous entities, the model obtains a lower F1 of 57.2% for signs. Conclusions: While our results are promising, there is still much room for improvement, especially with respect to the identification of clinical manifestations (signs and symptoms).
https://doi.org/10.1186/s12859-022-04810-y
Rare diseases
Named entity recognition
Deep learning
title Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts
topic Rare diseases
Named entity recognition
Deep learning
url https://doi.org/10.1186/s12859-022-04810-y