Extracting cancer concepts from clinical notes using natural language processing: a systematic review

Abstract Background Extracting information from free texts using natural language processing (NLP) can save time and reduce the hassle of manually extracting large quantities of data from incredibly complex clinical notes of cancer patients. This study aimed to systematically review studies that use...

Bibliographic Details
Main Authors:	Maryam Gholipour, Reza Khajouei, Parastoo Amiri, Sadrieh Hajesmaeel Gohari, Leila Ahmadian
Format:	Article
Language:	English
Published:	BMC 2023-10-01
Series:	BMC Bioinformatics
Subjects:	Neoplasms; Natural language processing; NLP; Machine learning; Terminology; Information system
Online Access:	https://doi.org/10.1186/s12859-023-05480-0

_version_	1797647117411418112
author	Maryam Gholipour, Reza Khajouei, Parastoo Amiri, Sadrieh Hajesmaeel Gohari, Leila Ahmadian
author_sort	Maryam Gholipour
collection	DOAJ
description	Background: Extracting information from free text with natural language processing (NLP) can save time and reduce the burden of manually extracting large quantities of data from the highly complex clinical notes of cancer patients. This study aimed to systematically review studies that used NLP methods to automatically identify cancer concepts in clinical notes. Methods: PubMed, Scopus, Web of Science, and Embase were searched until June 29, 2021, for English-language papers using a combination of terms concerning “Cancer”, “NLP”, “Coding”, and “Registries”. Two reviewers independently assessed the eligibility of papers for inclusion in the review. Results: Most of the software programs reported for concept extraction were developed by the researchers themselves (n = 7). Rule-based algorithms were the most frequently used algorithms for developing these programs. Most articles evaluated the algorithms using accuracy (n = 14) and sensitivity (n = 12). In addition, Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT) and Unified Medical Language System (UMLS) were the most commonly used terminologies for identifying concepts. The cancers studied most often were breast cancer (n = 4, 19%) and lung cancer (n = 4, 19%). Conclusion: The use of NLP for extracting the concepts and symptoms of cancer has increased in recent years. Rule-based algorithms are popular among developers. Given the high accuracy and sensitivity of these algorithms in identifying and extracting cancer concepts, we suggest that future studies use them to extract the concepts of other diseases as well.
first_indexed	2024-03-11T15:12:44Z
format	Article
id	doaj.art-5bab94b6f9104d32abb2ceb885ab1597
institution	Directory of Open Access Journals
issn	1471-2105
language	English
last_indexed	2024-03-11T15:12:44Z
publishDate	2023-10-01
publisher	BMC
record_format	Article
series	BMC Bioinformatics
spelling	doaj.art-5bab94b6f9104d32abb2ceb885ab1597; 2023-10-29T12:38:06Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2023-10-01; 241116; 10.1186/s12859-023-05480-0; Extracting cancer concepts from clinical notes using natural language processing: a systematic review; Maryam Gholipour (Student Research Committee, Kerman University of Medical Sciences); Reza Khajouei (Department of Health Information Sciences, Faculty of Management and Medical Information Sciences, Kerman University of Medical Sciences); Parastoo Amiri (Student Research Committee, Kerman University of Medical Sciences); Sadrieh Hajesmaeel Gohari (Medical Informatics Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences); Leila Ahmadian (Department of Health Information Sciences, Faculty of Management and Medical Information Sciences, Kerman University of Medical Sciences); https://doi.org/10.1186/s12859-023-05480-0; Neoplasms; Natural language processing; NLP; Machine learning; Terminology; Information system
title	Extracting cancer concepts from clinical notes using natural language processing: a systematic review
topic	Neoplasms; Natural language processing; NLP; Machine learning; Terminology; Information system
url	https://doi.org/10.1186/s12859-023-05480-0

Similar Items