SnapEnsemFS: a snapshot ensembling-based deep feature selection model for colorectal cancer histological analysis

Abstract Colorectal cancer is the third most common type of cancer diagnosed annually, and the second leading cause of death due to cancer. Early diagnosis of this ailment is vital for preventing the tumours from spreading and for planning treatment that could eradicate the disease. However, population-wide scr...

Bibliographic Details
Main Authors: Soumitri Chattopadhyay, Pawan Kumar Singh, Muhammad Fazal Ijaz, SeongKi Kim, Ram Sarkar
Format: Article
Language: English
Published: Nature Portfolio 2023-06-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-023-36921-8
collection DOAJ
description Abstract Colorectal cancer is the third most common type of cancer diagnosed annually, and the second leading cause of death due to cancer. Early diagnosis of this ailment is vital for preventing the tumours from spreading and for planning treatment that could eradicate the disease. However, population-wide screening is hindered by the need for medical professionals to analyse histological slides manually. Thus, this research proposes an automated computer-aided detection (CAD) framework based on deep learning that makes predictions from histological slide images. Ensemble learning is a popular strategy for fusing the salient properties of several models to make the final predictions. However, such frameworks are computationally costly since they require the training of multiple base learners. In this study, we instead adopt a snapshot ensemble method in which, rather than fusing decision scores from the snapshots of a Convolutional Neural Network (CNN) model as is traditionally done, we extract deep features from the penultimate layer of the CNN model. Since the deep features are extracted from the same CNN model but for different learning environments, there may be redundancy in the feature set. To alleviate this, the features are fed into Particle Swarm Optimization, a popular meta-heuristic, for dimensionality reduction of the feature space and better classification. Upon evaluation on a publicly available colorectal cancer histology dataset using a five-fold cross-validation scheme, the proposed method achieves a best accuracy of 97.60% and an F1-score of 97.61%, outperforming existing state-of-the-art methods on the same dataset. Further, qualitative investigation of class activation maps provides visual explainability to medical practitioners and justifies the use of the CAD framework in screening of colorectal histology. Our source code is publicly accessible at: https://github.com/soumitri2001/SnapEnsemFS
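The feature selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the repository linked above); the abstract does not specify the PSO variant, the fitness function, or the classifier, so the sketch below assumes a sigmoid-transfer binary PSO, a nearest-centroid classifier, and a fitness that trades accuracy against the fraction of features kept. The names fitness, binary_pso and make_snapshot_features are hypothetical, and synthetic data stands in for the concatenated penultimate-layer features of several snapshots.

```python
import numpy as np

def make_snapshot_features(n=200, n_informative=4, n_noise=12, n_snapshots=3, seed=0):
    """Synthetic stand-in for concatenated snapshot features (assumption, not real CNN output).
    Each 'snapshot' contributes a noisy copy of the same informative signal plus pure noise,
    which mimics the redundancy the abstract mentions."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, n)
    blocks = []
    for _ in range(n_snapshots):
        informative = y[:, None] * 2.0 + rng.normal(0.0, 1.0, (n, n_informative))
        noise = rng.normal(0.0, 1.0, (n, n_noise))
        blocks.append(np.hstack([informative, noise]))
    return np.hstack(blocks), y

def fitness(mask, X_tr, y_tr, X_va, y_va, alpha=0.99):
    """Assumed fitness: alpha * validation accuracy + (1 - alpha) * fraction of features dropped."""
    if not mask.any():
        return 0.0  # an empty feature subset cannot classify anything
    Xt, Xv = X_tr[:, mask], X_va[:, mask]
    classes = np.unique(y_tr)
    centroids = np.stack([Xt[y_tr == c].mean(axis=0) for c in classes])
    dists = ((Xv[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    acc = (classes[dists.argmin(axis=1)] == y_va).mean()
    return alpha * acc + (1 - alpha) * (1 - mask.mean())

def binary_pso(X_tr, y_tr, X_va, y_va, n_particles=20, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a boolean mask over the feature columns."""
    rng = np.random.default_rng(seed)
    d = X_tr.shape[1]
    pos = rng.random((n_particles, d)) < 0.5
    vel = rng.uniform(-1.0, 1.0, (n_particles, d))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, X_tr, y_tr, X_va, y_va) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iters):
        r1, r2 = rng.random((2, n_particles, d))
        pos_f = pos.astype(float)
        # Standard velocity update: inertia + pull towards personal and global bests.
        vel = (w * vel
               + c1 * r1 * (pbest.astype(float) - pos_f)
               + c2 * r2 * (gbest.astype(float) - pos_f))
        vel = np.clip(vel, -6.0, 6.0)
        # Sigmoid transfer: velocity becomes the probability that a feature is selected.
        pos = rng.random((n_particles, d)) < 1.0 / (1.0 + np.exp(-vel))
        fit = np.array([fitness(p, X_tr, y_tr, X_va, y_va) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest, gbest_fit

X, y = make_snapshot_features()
mask, score = binary_pso(X[:150], y[:150], X[150:], y[150:])
print(f"kept {mask.sum()} of {mask.size} features, fitness {score:.3f}")
```

In a real pipeline the selected columns (X[:, mask]) would be passed to the final classifier, and the selection would be repeated inside each cross-validation fold so that the held-out fold never influences which features are kept.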
first_indexed 2024-03-13T03:21:30Z
format Article
id doaj.art-b7513d0e7c4f4bc28d32493cf7c0c979
institution Directory Open Access Journal
issn 2045-2322
language English
last_indexed 2024-03-13T03:21:30Z
publishDate 2023-06-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-b7513d0e7c4f4bc28d32493cf7c0c979 2023-06-25T11:17:37Z eng Nature Portfolio Scientific Reports 2045-2322 2023-06-01 131118 10.1038/s41598-023-36921-8
Soumitri Chattopadhyay, Department of Information Technology, Jadavpur University
Pawan Kumar Singh, Department of Information Technology, Jadavpur University
Muhammad Fazal Ijaz, Department of Mechanical Engineering, Faculty of Engineering and Information Technology, The University of Melbourne
SeongKi Kim, National Centre of Excellence in Software, Sangmyung University
Ram Sarkar, Department of Computer Science & Engineering, Jadavpur University
url https://doi.org/10.1038/s41598-023-36921-8