How Natural Language Processing Can Aid With Pulmonary Oncology Tumor Node Metastasis Staging From Free-Text Radiology Reports: Algorithm Development and Validation

Background: Natural language processing (NLP) is thought to be a promising solution to extract and store concepts from free text in a structured manner for data mining purposes. This is also true for radiology reports, which still consist mostly of free text. Accurate and compl...

Bibliographic Details
Main Authors: Sander Puts, Martijn Nobel, Catharina Zegers, Iñigo Bermejo, Simon Robben, Andre Dekker
Format: Article
Language: English
Published: JMIR Publications 2023-03-01
Series: JMIR Formative Research
Online Access: https://formative.jmir.org/2023/1/e38125
author Sander Puts
Martijn Nobel
Catharina Zegers
Iñigo Bermejo
Simon Robben
Andre Dekker
collection DOAJ
description Background: Natural language processing (NLP) is thought to be a promising solution to extract and store concepts from free text in a structured manner for data mining purposes. This is also true for radiology reports, which still consist mostly of free text. Accurate and complete reports are very important for clinical decision support, for instance, in oncological staging. As such, NLP can be a tool to structure the content of the radiology report, thereby increasing the report’s value. Objective: This study describes the implementation and validation of an N-stage classifier for pulmonary oncology. It is based on free-text radiological chest computed tomography reports according to the tumor, node, and metastasis (TNM) classification, which has been added to the already existing T-stage classifier to create a combined TN-stage classifier. Methods: SpaCy, PyContextNLP, and regular expressions were used for proper information extraction, after additional rules were set to accurately extract N-stage. Results: The overall TN-stage classifier accuracy scores were 0.84 and 0.85, respectively, for the training (N=95) and validation (N=97) sets. This is comparable to the outcomes of the T-stage classifier (0.87-0.92). Conclusions: This study shows that NLP has potential in classifying pulmonary oncology from free-text radiological reports according to the TNM classification system as both the T- and N-stages can be extracted with high accuracy.
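The Methods summary above mentions regular expressions and context rules for extracting the N-stage. As a rough illustration only, the sketch below shows what rule-based N-stage extraction could look like. It is not the authors' classifier: it uses only Python's standard `re` module in place of SpaCy and PyContextNLP, and the names `N_RULES`, `NODE`, `NEGATION`, `classify_sentence`, and `classify_report` are hypothetical. The location-to-stage mapping is a heavily simplified reading of the TNM N categories (for example, mediastinal nodes are assumed ipsilateral unless the sentence says "contralateral").

```python
import re

# Hypothetical, simplified rules: node location -> N category implied when
# the node is reported as involved. Ordered from highest to lowest stage so
# that "contralateral hilar" resolves to N3 rather than N1.
N_RULES = [
    (3, re.compile(r"\b(contralateral|supraclavicular|scalene)\b", re.I)),
    (2, re.compile(r"\b(mediastinal|subcarinal|paratracheal)\b", re.I)),
    (1, re.compile(r"\b(hilar|peribronchial|intrapulmonary)\b", re.I)),
]
# Terms that signal a lymph node finding (assumed vocabulary).
NODE = re.compile(r"\b(lymph\s+nodes?|lymphadenopathy|adenopathy)\b", re.I)
# Crude stand-in for context rules: a negation cue before the node mention.
NEGATION = re.compile(r"\b(no|not|without|absent)\b", re.I)


def classify_sentence(sentence):
    """Return the N category implied by one sentence, or None if undecided."""
    node = NODE.search(sentence)
    if node is None:
        return None  # sentence does not mention lymph nodes
    if NEGATION.search(sentence[:node.start()]):
        return 0  # node finding is negated
    for stage, pattern in N_RULES:
        if pattern.search(sentence):
            return stage
    return None  # node mentioned, but no recognised location


def classify_report(text):
    """Take the highest N category found in any sentence; 'NX' if none."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    stages = [s for s in map(classify_sentence, sentences) if s is not None]
    return f"N{max(stages)}" if stages else "NX"


if __name__ == "__main__":
    report = "Enlarged subcarinal lymph nodes. No supraclavicular lymphadenopathy."
    print(classify_report(report))  # N2
```

A real pipeline would need proper sentence segmentation, laterality resolution relative to the primary tumor, and scoped negation and uncertainty handling, which is the role the abstract assigns to SpaCy and PyContextNLP.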
first_indexed 2024-03-12T12:41:51Z
format Article
id doaj.art-e2f315d68008473cb49e83e7f91470e5
institution Directory of Open Access Journals
issn 2561-326X
language English
last_indexed 2024-03-12T12:41:51Z
publishDate 2023-03-01
publisher JMIR Publications
record_format Article
series JMIR Formative Research
spelling doaj.art-e2f315d68008473cb49e83e7f91470e5
2023-08-28T23:48:11Z
eng
JMIR Publications
JMIR Formative Research
2561-326X
2023-03-01
volume 7, article e38125
doi 10.2196/38125
Sander Puts https://orcid.org/0000-0003-4148-1755
Martijn Nobel https://orcid.org/0000-0003-3379-7290
Catharina Zegers https://orcid.org/0000-0002-9772-0869
Iñigo Bermejo https://orcid.org/0000-0001-9105-8088
Simon Robben https://orcid.org/0000-0002-8353-0116
Andre Dekker https://orcid.org/0000-0002-0422-7996
https://formative.jmir.org/2023/1/e38125
title How Natural Language Processing Can Aid With Pulmonary Oncology Tumor Node Metastasis Staging From Free-Text Radiology Reports: Algorithm Development and Validation
url https://formative.jmir.org/2023/1/e38125