Multimodal Deep Learning Methods on Image and Textual Data to Predict Radiotherapy Structure Names
Physicians often label anatomical structure sets in Digital Imaging and Communications in Medicine (DICOM) images with nonstandard, random names. Standardizing these names for the Organs at Risk (OARs), Planning Target Volumes (PTVs), and ‘Other’ organs is therefore a vital problem. This paper presents deep learning methods that integrate multimodal image and textual data to standardize prostate radiotherapy structure names.
Main Authors: | Priyankar Bose, Pratip Rana, William C. Sleeman, Sriram Srinivasan, Rishabh Kapoor, Jatinder Palta, Preetam Ghosh |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-06-01 |
Series: | BioMedInformatics |
Subjects: | multimodal data integration; radiotherapy structure names; radiation oncology; deep learning; TG-263 names |
Online Access: | https://www.mdpi.com/2673-7426/3/3/34 |
_version_ | 1797581087579308032 |
author | Priyankar Bose Pratip Rana William C. Sleeman Sriram Srinivasan Rishabh Kapoor Jatinder Palta Preetam Ghosh |
author_sort | Priyankar Bose |
collection | DOAJ |
description | Physicians often label anatomical structure sets in Digital Imaging and Communications in Medicine (DICOM) images with nonstandard, random names. Hence, the standardization of these names for the Organs at Risk (OARs), Planning Target Volumes (PTVs), and ‘Other’ organs is a vital problem. This paper presents novel deep learning methods for structure sets by integrating multimodal data compiled from the radiotherapy centers of the US Veterans Health Administration (VHA) and Virginia Commonwealth University (VCU). These de-identified data comprise 16,290 prostate structures. Our method integrates the multimodal textual and imaging data with Convolutional Neural Network (CNN)-based deep learning approaches, namely a plain CNN, the Visual Geometry Group (VGG) network, and the Residual Network (ResNet), and shows improved results in prostate radiotherapy structure name standardization. Evaluation with the macro-averaged F1 score shows that our model with single-modal textual data usually performs better than previous studies. The models perform well on textual data alone, and the addition of imaging data shows that deep neural networks can achieve better performance by exploiting information present in other modalities. Additionally, using masked images and masked doses along with text leads to an overall performance improvement for the CNN-based architectures compared with using all the modalities together. Undersampling the majority class leads to further performance enhancement. The VGG network on the masked image–dose data combined with a CNN on the text data performs best and represents the state of the art in this domain. |
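The description above evaluates models with the macro-averaged F1 score, i.e. the unweighted mean of per-class F1 over the structure-name classes, so rare names count as much as frequent ones. A minimal sketch of that metric (the function name and toy OAR/PTV/‘Other’ labels are illustrative, not taken from the paper):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Macro-averaging weights every class equally, which matters for
    imbalanced label sets such as radiotherapy structure names.
    """
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        f1s.append(f1)
    return sum(f1s) / len(f1s)

# Toy example using the three label groups named in the abstract:
y_true = ["OAR", "OAR", "PTV", "PTV", "Other", "Other"]
y_pred = ["OAR", "PTV", "PTV", "PTV", "OAR", "Other"]
print(round(macro_f1(y_true, y_pred), 3))  # → 0.656
```

In practice a library routine such as scikit-learn's `f1_score(..., average="macro")` computes the same quantity; the hand-rolled version just makes the per-class averaging explicit.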
first_indexed | 2024-03-10T23:00:10Z |
format | Article |
id | doaj.art-4c5cd532d66047d1beef2232096f4d3d |
institution | Directory Open Access Journal |
issn | 2673-7426 |
language | English |
last_indexed | 2024-03-10T23:00:10Z |
publishDate | 2023-06-01 |
publisher | MDPI AG |
record_format | Article |
series | BioMedInformatics |
spelling | doaj.art-4c5cd532d66047d1beef2232096f4d3d (2023-11-19T09:43:29Z). BioMedInformatics, MDPI AG, ISSN 2673-7426, vol. 3, no. 3, pp. 493–513, published 2023-06-01. DOI: 10.3390/biomedinformatics3030034. Title: Multimodal Deep Learning Methods on Image and Textual Data to Predict Radiotherapy Structure Names. Affiliations: Priyankar Bose, Pratip Rana, William C. Sleeman, and Preetam Ghosh, Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA; Sriram Srinivasan, Rishabh Kapoor, and Jatinder Palta, Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23284, USA. https://www.mdpi.com/2673-7426/3/3/34 |
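The abstract also reports that undersampling the majority class further improves performance. A minimal sketch of random majority-class undersampling (a hypothetical helper, not the authors' pipeline), which shrinks the largest class down to the size of the runner-up class:

```python
import random
from collections import Counter

def undersample_majority(samples, labels, seed=0):
    """Randomly drop majority-class samples until the majority class
    is no larger than the second-most-frequent class."""
    counts = Counter(labels)
    (majority, n_major), *rest = counts.most_common()
    target = rest[0][1] if rest else n_major  # size of runner-up class
    major_idx = [i for i, y in enumerate(labels) if y == majority]
    rng = random.Random(seed)
    keep = set(rng.sample(major_idx, target))
    kept = [(x, y) for i, (x, y) in enumerate(zip(samples, labels))
            if y != majority or i in keep]
    xs, ys = zip(*kept)
    return list(xs), list(ys)

# Toy example: 'Other' dominates, as is typical for structure names.
xs, ys = undersample_majority(list(range(15)),
                              ["Other"] * 10 + ["OAR"] * 3 + ["PTV"] * 2)
print(Counter(ys))  # 'Other' reduced to the runner-up count
```

Libraries such as imbalanced-learn offer the same idea as `RandomUnderSampler`; the sketch keeps the minority classes untouched and only discards majority-class examples.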
title | Multimodal Deep Learning Methods on Image and Textual Data to Predict Radiotherapy Structure Names |
title_sort | multimodal deep learning methods on image and textual data to predict radiotherapy structure names |
topic | multimodal data integration; radiotherapy structure names; radiation oncology; deep learning; TG-263 names |
url | https://www.mdpi.com/2673-7426/3/3/34 |