Light3DHS: A lightweight 3D hippocampus segmentation method using multiscale convolution attention and vision transformer

The morphological analysis and volume measurement of the hippocampus are crucial to the study of many brain diseases. Therefore, an accurate hippocampal segmentation method is beneficial for the development of clinical research in brain diseases. U-Net and its variants have become prevalent in hippo...


Bibliographic Details
Main Authors:	Zhiyong Xiao, Yuhong Zhang, Zhaohong Deng, Fei Liu
Format:	Article
Language:	English
Published:	Elsevier 2024-04-01
Series:	NeuroImage
Subjects:	Vision transformer; CNN; Lightweight; Multi-scale features fusion; 3D medical image segmentation
Online Access:	http://www.sciencedirect.com/science/article/pii/S1053811924001034

author	Zhiyong Xiao, Yuhong Zhang, Zhaohong Deng, Fei Liu
collection	DOAJ
description	The morphological analysis and volume measurement of the hippocampus are crucial to the study of many brain diseases. Therefore, an accurate hippocampal segmentation method is beneficial for the development of clinical research in brain diseases. U-Net and its variants have become prevalent in hippocampus segmentation of Magnetic Resonance Imaging (MRI) due to their effectiveness, and Transformer-based architectures have also received some attention. However, some existing methods focus too much on the shape and volume of the hippocampus rather than its spatial information, and they treat the extracted features independently, ignoring the correlation between local and global features. In addition, many methods cannot be applied effectively to practical medical image segmentation because of their large parameter counts and high computational complexity. To this end, we combined the advantages of CNNs and Vision Transformers (ViTs) and proposed a simple, lightweight model, Light3DHS, for 3D hippocampus segmentation. To obtain richer local contextual features, the encoder first uses a multi-scale convolutional attention (MCA) module to learn the spatial information of the hippocampus. Considering the importance of local features and global semantics for 3D segmentation, we used a lightweight ViT to learn scale-invariant high-level features and to further fuse local-to-global representations. To evaluate the effectiveness of the encoder's feature representation, we designed three decoders of different complexity to generate segmentation maps. Experiments on three common hippocampal datasets demonstrate that the network achieves more accurate hippocampus segmentation with fewer parameters. Light3DHS performs better than other state-of-the-art algorithms.
first_indexed	2024-04-24T07:38:28Z
format	Article
id	doaj.art-4b964bb6698a41b3aa4473473d99a005
institution	Directory of Open Access Journals
issn	1095-9572
language	English
last_indexed	2024-04-24T07:38:28Z
publishDate	2024-04-01
publisher	Elsevier
record_format	Article
series	NeuroImage
spelling	doaj.art-4b964bb6698a41b3aa4473473d99a005 | 2024-04-20T04:17:13Z | eng | Elsevier | NeuroImage | 1095-9572 | 2024-04-01 | 292 | 120608 | Light3DHS: A lightweight 3D hippocampus segmentation method using multiscale convolution attention and vision transformer | Zhiyong Xiao (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214122, China; Institut Fresnel, Centre National de la Recherche Scientifique, Marseille, 13397, France), Yuhong Zhang (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214122, China), Zhaohong Deng (School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214122, China), Fei Liu (Wuxi Hospital of Traditional Chinese Medicine, Wuxi, 214071, China; corresponding author) | http://www.sciencedirect.com/science/article/pii/S1053811924001034 | Vision transformer; CNN; Lightweight; Multi-scale features fusion; 3D medical image segmentation
title	Light3DHS: A lightweight 3D hippocampus segmentation method using multiscale convolution attention and vision transformer
topic	Vision transformer; CNN; Lightweight; Multi-scale features fusion; 3D medical image segmentation
url	http://www.sciencedirect.com/science/article/pii/S1053811924001034


Similar Items