Utilizing adaptive deformable convolution and position embedding for colon polyp segmentation with a visual transformer
Abstract: Polyp detection is a challenging task in the diagnosis of colorectal cancer (CRC), and it demands clinical expertise due to the diverse nature of polyps. Recent years have witnessed the development of automated polyp detection systems that assist experts in early diagnosis, considerab...
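Main Authors: | Mohamed Yacin Sikkandar, Sankar Ganesh Sundaram, Ahmad Alassaf, Ibrahim AlMohimeed, Khalid Alhussaini, Adham Aleid, Salem Ali Alolayan, P. Ramkumar, Meshal Khalaf Almutairi, S. Sabarunisha Begum |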
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2024-03-01 |
Series: | Scientific Reports |
Subjects: | Polyp segmentation, Vision transformer, Deformable convolution |
Online Access: | https://doi.org/10.1038/s41598-024-57993-0 |
_version_ | 1797233643772444672 |
---|---|
author | Mohamed Yacin Sikkandar; Sankar Ganesh Sundaram; Ahmad Alassaf; Ibrahim AlMohimeed; Khalid Alhussaini; Adham Aleid; Salem Ali Alolayan; P. Ramkumar; Meshal Khalaf Almutairi; S. Sabarunisha Begum |
author_sort | Mohamed Yacin Sikkandar |
collection | DOAJ |
description | Abstract: Polyp detection is a challenging task in the diagnosis of colorectal cancer (CRC), and it demands clinical expertise due to the diverse nature of polyps. Recent years have witnessed the development of automated polyp detection systems that assist experts in early diagnosis, considerably reducing time consumption and diagnostic errors. In automated CRC diagnosis, polyp segmentation is an important step, which is carried out with deep learning segmentation models. Recently, Vision Transformers (ViTs) have been gradually replacing these models due to their ability to capture long-range dependencies among image patches. However, existing ViTs for polyp segmentation do not harness their inherent self-attention abilities and instead incorporate complex attention mechanisms. This paper presents Polyp-Vision Transformer (Polyp-ViT), a novel Transformer model based on the conventional Transformer architecture, which is enhanced with adaptive mechanisms for feature extraction and positional embedding. Polyp-ViT is tested on the Kvasir-seg and CVC-Clinic DB datasets, achieving segmentation accuracies of 0.9891 ± 0.01 and 0.9875 ± 0.71, respectively, outperforming state-of-the-art models. Polyp-ViT is a prospective tool for polyp segmentation that can also be adapted to other medical image segmentation tasks due to its ability to generalize well. |
first_indexed | 2024-04-24T16:19:26Z |
format | Article |
id | doaj.art-12faaca63b514f9880d9d672e683475c |
institution | Directory of Open Access Journals |
issn | 2045-2322 |
language | English |
last_indexed | 2024-04-24T16:19:26Z |
publishDate | 2024-03-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-12faaca63b514f9880d9d672e683475c; 2024-03-31T11:16:19Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2024-03-01; vol. 14, no. 1, pp. 1-16; 10.1038/s41598-024-57993-0; Utilizing adaptive deformable convolution and position embedding for colon polyp segmentation with a visual transformer; Mohamed Yacin Sikkandar (Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University); Sankar Ganesh Sundaram (Department of Artificial Intelligence and Data Science, KPR Institute of Engineering and Technology); Ahmad Alassaf (Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University); Ibrahim AlMohimeed (Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University); Khalid Alhussaini (Department of Biomedical Technology, College of Applied Medical Sciences, King Saud University); Adham Aleid (Department of Biomedical Technology, College of Applied Medical Sciences, King Saud University); Salem Ali Alolayan (Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University); P. Ramkumar (Department of Computer Science and Engineering, Sri Sairam College of Engineering); Meshal Khalaf Almutairi (Department of Medical Equipment Technology, College of Applied Medical Sciences, Majmaah University); S. Sabarunisha Begum (Department of Biotechnology, P.S.R. Engineering College); https://doi.org/10.1038/s41598-024-57993-0; Polyp segmentation; Vision transformer; Deformable convolution |
title | Utilizing adaptive deformable convolution and position embedding for colon polyp segmentation with a visual transformer |
topic | Polyp segmentation; Vision transformer; Deformable convolution |
url | https://doi.org/10.1038/s41598-024-57993-0 |