Extracting cancer concepts from clinical notes using natural language processing: a systematic review
Abstract Background Extracting information from free texts using natural language processing (NLP) can save time and reduce the hassle of manually extracting large quantities of data from incredibly complex clinical notes of cancer patients. This study aimed to systematically review studies that use...
Main Authors: | Maryam Gholipour, Reza Khajouei, Parastoo Amiri, Sadrieh Hajesmaeel Gohari, Leila Ahmadian |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2023-10-01 |
Series: | BMC Bioinformatics |
Subjects: | Neoplasms; Natural language processing; NLP; Machine learning; Terminology; Information system |
Online Access: | https://doi.org/10.1186/s12859-023-05480-0 |
_version_ | 1797647117411418112 |
---|---|
author | Maryam Gholipour; Reza Khajouei; Parastoo Amiri; Sadrieh Hajesmaeel Gohari; Leila Ahmadian |
author_sort | Maryam Gholipour |
collection | DOAJ |
description | Abstract. Background: Extracting information from free text using natural language processing (NLP) can save time and reduce the burden of manually extracting large quantities of data from highly complex clinical notes of cancer patients. This study aimed to systematically review studies that used NLP methods to automatically identify cancer concepts from clinical notes. Methods: PubMed, Scopus, Web of Science, and Embase were searched for English-language papers using a combination of terms concerning “Cancer”, “NLP”, “Coding”, and “Registries” until June 29, 2021. Two reviewers independently assessed the eligibility of papers for inclusion in the review. Results: Most of the reported software programs used for concept extraction were developed by the researchers themselves (n = 7). Rule-based algorithms were the most frequently used algorithms for developing these programs. In most articles, accuracy (n = 14) and sensitivity (n = 12) were the criteria used to evaluate the algorithms. In addition, Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT) and Unified Medical Language System (UMLS) were the most commonly used terminologies to identify concepts. Most studies focused on breast cancer (n = 4, 19%) and lung cancer (n = 4, 19%). Conclusion: The use of NLP for extracting the concepts and symptoms of cancer has increased in recent years. Rule-based algorithms are popular among developers. Given these algorithms' high accuracy and sensitivity in identifying and extracting cancer concepts, we suggest that future studies use them to extract the concepts of other diseases as well. |
first_indexed | 2024-03-11T15:12:44Z |
format | Article |
id | doaj.art-5bab94b6f9104d32abb2ceb885ab1597 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-03-11T15:12:44Z |
publishDate | 2023-10-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | Record timestamp: 2023-10-29T12:38:06Z. Author affiliations: Maryam Gholipour (Student Research Committee, Kerman University of Medical Sciences); Reza Khajouei (Department of Health Information Sciences, Faculty of Management and Medical Information Sciences, Kerman University of Medical Sciences); Parastoo Amiri (Student Research Committee, Kerman University of Medical Sciences); Sadrieh Hajesmaeel Gohari (Medical Informatics Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences); Leila Ahmadian (Department of Health Information Sciences, Faculty of Management and Medical Information Sciences, Kerman University of Medical Sciences) |
title | Extracting cancer concepts from clinical notes using natural language processing: a systematic review |
topic | Neoplasms; Natural language processing; NLP; Machine learning; Terminology; Information system |
url | https://doi.org/10.1186/s12859-023-05480-0 |