Head and Neck Cancer Segmentation in FDG PET Images: Performance Comparison of Convolutional Neural Networks and Vision Transformers
Convolutional neural networks (CNNs) have a proven track record in medical image segmentation. Recently, Vision Transformers were introduced and are gaining popularity for many computer vision applications, including object detection, classification, and segmentation. Machine learning algorithms such as CNNs or Transformers are subject to an inductive bias, which can have a significant impact on the performance of machine learning models. This is especially relevant for medical image segmentation applications where limited training data are available, and a model’s inductive bias should help it to generalize well. In this work, we quantitatively assess the performance of two CNN-based networks (U-Net and U-Net-CBAM) and three popular Transformer-based segmentation network architectures (UNETR, TransBTS, and VT-UNet) in the context of head and neck cancer (HNC) lesion segmentation in volumetric [F-18] fluorodeoxyglucose (FDG) PET scans. For performance assessment, 272 FDG PET-CT scans from a clinical trial (ACRIN 6685) were utilized, which include a total of 650 lesions (272 primary and 378 secondary). The image data used are highly diverse and representative of clinical use. For performance analysis, several error metrics were utilized. The achieved Dice coefficients ranged from 0.809 to 0.833, with the best performance achieved by the CNN-based approaches. U-Net-CBAM, which utilizes spatial and channel attention, showed several advantages for smaller lesions compared to the standard U-Net. Furthermore, our results provide some insight regarding the image features relevant for this specific segmentation application. In addition, the results highlight the need to utilize both primary and secondary lesions to derive clinically relevant, unbiased estimates of segmentation performance.
Main Authors: | Xiaofan Xiong, Brian J. Smith, Stephen A. Graves, Michael M. Graham, John M. Buatti, Reinhard R. Beichel |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-10-01 |
Series: | Tomography |
Subjects: | head and neck cancer; segmentation; FDG PET; CNN; Vision Transformer |
Online Access: | https://www.mdpi.com/2379-139X/9/5/151 |
_version_ | 1827719794405670912 |
author | Xiaofan Xiong; Brian J. Smith; Stephen A. Graves; Michael M. Graham; John M. Buatti; Reinhard R. Beichel
author_facet | Xiaofan Xiong; Brian J. Smith; Stephen A. Graves; Michael M. Graham; John M. Buatti; Reinhard R. Beichel
author_sort | Xiaofan Xiong |
collection | DOAJ |
description | Convolutional neural networks (CNNs) have a proven track record in medical image segmentation. Recently, Vision Transformers were introduced and are gaining popularity for many computer vision applications, including object detection, classification, and segmentation. Machine learning algorithms such as CNNs or Transformers are subject to an inductive bias, which can have a significant impact on the performance of machine learning models. This is especially relevant for medical image segmentation applications where limited training data are available, and a model’s inductive bias should help it to generalize well. In this work, we quantitatively assess the performance of two CNN-based networks (U-Net and U-Net-CBAM) and three popular Transformer-based segmentation network architectures (UNETR, TransBTS, and VT-UNet) in the context of head and neck cancer (HNC) lesion segmentation in volumetric [F-18] fluorodeoxyglucose (FDG) PET scans. For performance assessment, 272 FDG PET-CT scans from a clinical trial (ACRIN 6685) were utilized, which include a total of 650 lesions (272 primary and 378 secondary). The image data used are highly diverse and representative of clinical use. For performance analysis, several error metrics were utilized. The achieved Dice coefficients ranged from 0.809 to 0.833, with the best performance achieved by the CNN-based approaches. U-Net-CBAM, which utilizes spatial and channel attention, showed several advantages for smaller lesions compared to the standard U-Net. Furthermore, our results provide some insight regarding the image features relevant for this specific segmentation application. In addition, the results highlight the need to utilize both primary and secondary lesions to derive clinically relevant, unbiased estimates of segmentation performance.
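The abstract reports segmentation quality as Dice coefficients (0.809 to 0.833). As a reference point, below is a minimal sketch of computing the Dice coefficient for volumetric binary lesion masks; the function name and the NumPy-based implementation are illustrative assumptions, not code from the paper.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary 3D lesion masks.

    Illustrative helper (not from the paper): Dice = 2|A ∩ B| / (|A| + |B|).
    """
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * intersection / denom
```

Similarly, U-Net-CBAM is described as adding spatial and channel attention to the standard U-Net. The following is a minimal PyTorch sketch of a CBAM-style block for 3D feature maps, assuming the usual channel-then-spatial ordering; the module names, reduction ratio, and kernel size are assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn

class ChannelAttention3D(nn.Module):
    """Channel attention: a shared MLP scores avg- and max-pooled descriptors."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c = x.shape[:2]                  # x: (B, C, D, H, W)
        avg = x.mean(dim=(2, 3, 4))         # global average pooling -> (B, C)
        mx = x.amax(dim=(2, 3, 4))          # global max pooling -> (B, C)
        attn = torch.sigmoid(self.mlp(avg) + self.mlp(mx))
        return x * attn.view(b, c, 1, 1, 1)

class SpatialAttention3D(nn.Module):
    """Spatial attention: a conv scores concatenated channel-mean/max maps."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv3d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)   # (B, 1, D, H, W)
        mx = x.amax(dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn

class CBAM3D(nn.Module):
    """CBAM-style block: channel attention followed by spatial attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.channel = ChannelAttention3D(channels)
        self.spatial = SpatialAttention3D()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.spatial(self.channel(x))
```

In a U-Net-CBAM variant, such a block would typically sit after the convolution stages of encoder and/or decoder levels, so that feature maps are reweighted before being passed on.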
first_indexed | 2024-03-10T20:51:06Z |
format | Article |
id | doaj.art-83b20ab732704186bca6974681f43022 |
institution | Directory Open Access Journal |
issn | 2379-1381; 2379-139X
language | English |
last_indexed | 2024-03-10T20:51:06Z |
publishDate | 2023-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Tomography |
spelling | doaj.art-83b20ab732704186bca6974681f43022 (indexed 2023-11-19T18:21:05Z); eng; MDPI AG; Tomography; ISSN 2379-1381, 2379-139X; 2023-10-01; vol. 9, no. 5, pp. 1933-1948; DOI 10.3390/tomography9050151; Head and Neck Cancer Segmentation in FDG PET Images: Performance Comparison of Convolutional Neural Networks and Vision Transformers; Xiaofan Xiong (Department of Biomedical Engineering, The University of Iowa, Iowa City, IA 52242, USA); Brian J. Smith (Department of Biostatistics, The University of Iowa, Iowa City, IA 52242, USA); Stephen A. Graves (Department of Radiology, The University of Iowa, Iowa City, IA 52242, USA); Michael M. Graham (Department of Radiology, The University of Iowa, Iowa City, IA 52242, USA); John M. Buatti (Department of Radiation Oncology, University of Iowa Hospitals and Clinics, Iowa City, IA 52242, USA); Reinhard R. Beichel (Department of Electrical and Computer Engineering, The University of Iowa, Iowa City, IA 52242, USA); https://www.mdpi.com/2379-139X/9/5/151; head and neck cancer; segmentation; FDG PET; CNN; Vision Transformer
spellingShingle | Xiaofan Xiong; Brian J. Smith; Stephen A. Graves; Michael M. Graham; John M. Buatti; Reinhard R. Beichel; Head and Neck Cancer Segmentation in FDG PET Images: Performance Comparison of Convolutional Neural Networks and Vision Transformers; Tomography; head and neck cancer; segmentation; FDG PET; CNN; Vision Transformer
title | Head and Neck Cancer Segmentation in FDG PET Images: Performance Comparison of Convolutional Neural Networks and Vision Transformers |
title_full | Head and Neck Cancer Segmentation in FDG PET Images: Performance Comparison of Convolutional Neural Networks and Vision Transformers |
title_fullStr | Head and Neck Cancer Segmentation in FDG PET Images: Performance Comparison of Convolutional Neural Networks and Vision Transformers |
title_full_unstemmed | Head and Neck Cancer Segmentation in FDG PET Images: Performance Comparison of Convolutional Neural Networks and Vision Transformers |
title_short | Head and Neck Cancer Segmentation in FDG PET Images: Performance Comparison of Convolutional Neural Networks and Vision Transformers |
title_sort | head and neck cancer segmentation in fdg pet images performance comparison of convolutional neural networks and vision transformers |
topic | head and neck cancer; segmentation; FDG PET; CNN; Vision Transformer
url | https://www.mdpi.com/2379-139X/9/5/151 |
work_keys_str_mv | AT xiaofanxiong headandneckcancersegmentationinfdgpetimagesperformancecomparisonofconvolutionalneuralnetworksandvisiontransformers AT brianjsmith headandneckcancersegmentationinfdgpetimagesperformancecomparisonofconvolutionalneuralnetworksandvisiontransformers AT stephenagraves headandneckcancersegmentationinfdgpetimagesperformancecomparisonofconvolutionalneuralnetworksandvisiontransformers AT michaelmgraham headandneckcancersegmentationinfdgpetimagesperformancecomparisonofconvolutionalneuralnetworksandvisiontransformers AT johnmbuatti headandneckcancersegmentationinfdgpetimagesperformancecomparisonofconvolutionalneuralnetworksandvisiontransformers AT reinhardrbeichel headandneckcancersegmentationinfdgpetimagesperformancecomparisonofconvolutionalneuralnetworksandvisiontransformers |