Utilizing adaptive deformable convolution and position embedding for colon polyp segmentation with a visual transformer

Abstract Polyp detection is a challenging task in the diagnosis of colorectal cancer (CRC), and it demands clinical expertise due to the diverse nature of polyps. Recent years have witnessed the development of automated polyp detection systems to assist experts in early diagnosis, considerab...


Bibliographic Details
Main Authors: Mohamed Yacin Sikkandar, Sankar Ganesh Sundaram, Ahmad Alassaf, Ibrahim AlMohimeed, Khalid Alhussaini, Adham Aleid, Salem Ali Alolayan, P. Ramkumar, Meshal Khalaf Almutairi, S. Sabarunisha Begum
Format: Article
Language: English
Published: Nature Portfolio 2024-03-01
Series: Scientific Reports
Subjects: Polyp segmentation, Vision transformer, Deformable convolution
Online Access: https://doi.org/10.1038/s41598-024-57993-0
collection DOAJ
description Abstract Polyp detection is a challenging task in the diagnosis of colorectal cancer (CRC), and it demands clinical expertise due to the diverse nature of polyps. Recent years have witnessed the development of automated polyp detection systems that assist experts in early diagnosis, considerably reducing diagnosis time and diagnostic errors. In automated CRC diagnosis, polyp segmentation is an important step, carried out with deep learning segmentation models. Recently, Vision Transformers (ViTs) have begun to replace these models because of their ability to capture long-range dependencies among image patches. However, existing ViTs for polyp segmentation do not harness the inherent self-attention abilities and instead incorporate complex attention mechanisms. This paper presents Polyp-Vision Transformer (Polyp-ViT), a novel Transformer model based on the conventional Transformer architecture and enhanced with adaptive mechanisms for feature extraction and positional embedding. Polyp-ViT is tested on the Kvasir-seg and CVC-Clinic DB datasets, achieving segmentation accuracies of 0.9891 ± 0.01 and 0.9875 ± 0.71, respectively, outperforming state-of-the-art models. Polyp-ViT is a prospective tool for polyp segmentation that can also be adapted to other medical image segmentation tasks because it generalizes well.
id doaj.art-12faaca63b514f9880d9d672e683475c
institution Directory Open Access Journal
issn 2045-2322
spelling doaj.art-12faaca63b514f9880d9d672e683475c
2024-03-31T11:16:19Z
eng
Nature Portfolio
Scientific Reports 2045-2322, 2024-03-01, 14(1), 1-16
10.1038/s41598-024-57993-0
Author affiliations:
Mohamed Yacin Sikkandar: Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University
Sankar Ganesh Sundaram: Department of Artificial Intelligence and Data Science, KPR Institute of Engineering and Technology
Ahmad Alassaf: Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University
Ibrahim AlMohimeed: Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University
Khalid Alhussaini: Department of Biomedical Technology, College of Applied Medical Sciences, King Saud University
Adham Aleid: Department of Biomedical Technology, College of Applied Medical Sciences, King Saud University
Salem Ali Alolayan: Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University
P. Ramkumar: Department of Computer Science and Engineering, Sri Sairam College of Engineering
Meshal Khalaf Almutairi: Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University
S. Sabarunisha Begum: Department of Biotechnology, P.S.R. Engineering College
topic Polyp segmentation
Vision transformer
Deformable convolution