SnapEnsemFS: a snapshot ensembling-based deep feature selection model for colorectal cancer histological analysis
Abstract Colorectal cancer is the third most common type of cancer diagnosed annually, and the second leading cause of death due to cancer. Early diagnosis is vital for preventing tumours from spreading and for planning treatment that may eradicate the disease. However, population-wide scr...
Main Authors: | Soumitri Chattopadhyay, Pawan Kumar Singh, Muhammad Fazal Ijaz, SeongKi Kim, Ram Sarkar |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio 2023-06-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-023-36921-8 |
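
The abstract of this record describes a snapshot ensemble in which deep features, rather than decision scores, are taken from several snapshots of one CNN. The record does not state how the snapshots are scheduled, so the sketch below assumes the cyclic cosine-annealing learning rate commonly used for snapshot ensembling. All function names and parameters are hypothetical and are not taken from the authors' repository.

```python
import math


def cosine_cyclic_lr(step, total_steps, n_cycles, base_lr):
    """Learning rate at a 0-indexed step under cyclic cosine annealing (assumed schedule)."""
    cycle_len = math.ceil(total_steps / n_cycles)
    pos = step % cycle_len  # position inside the current cycle
    return 0.5 * base_lr * (math.cos(math.pi * pos / cycle_len) + 1.0)


def snapshot_steps(total_steps, n_cycles):
    """Steps at which a cycle ends, i.e. where a snapshot of the weights would be saved."""
    cycle_len = math.ceil(total_steps / n_cycles)
    return [s for s in range(total_steps) if (s + 1) % cycle_len == 0]


def concat_snapshot_features(per_snapshot_features):
    """Concatenate the penultimate-layer feature vectors of one image across snapshots."""
    return [value for feats in per_snapshot_features for value in feats]
```

The learning rate restarts at `base_lr` at the start of each cycle and decays towards zero by its end, so each saved snapshot sits near a different local minimum. Concatenating the per-snapshot vectors yields the (possibly redundant) feature set that the abstract passes on to feature selection.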
_version_ | 1797795668896514048 |
---|---|
author | Soumitri Chattopadhyay Pawan Kumar Singh Muhammad Fazal Ijaz SeongKi Kim Ram Sarkar |
author_facet | Soumitri Chattopadhyay Pawan Kumar Singh Muhammad Fazal Ijaz SeongKi Kim Ram Sarkar |
author_sort | Soumitri Chattopadhyay |
collection | DOAJ |
description | Abstract Colorectal cancer is the third most common type of cancer diagnosed annually, and the second leading cause of death due to cancer. Early diagnosis is vital for preventing tumours from spreading and for planning treatment that may eradicate the disease. However, population-wide screening is hindered by the need for medical professionals to analyse histological slides manually. This research therefore proposes an automated computer-aided detection (CAD) framework, based on deep learning, that makes predictions from histological slide images. Ensemble learning is a popular strategy for fusing the salient properties of several models to make the final predictions. However, such frameworks are computationally costly because they require training multiple base learners. In this study, we instead adopt a snapshot ensemble method: rather than fusing decision scores from the snapshots of a Convolutional Neural Network (CNN) model, as is traditionally done, we extract deep features from the penultimate layer of the CNN model. Because the deep features are extracted from the same CNN model under different learning environments, the feature set may contain redundancy. To alleviate this, the features are fed into Particle Swarm Optimization, a popular meta-heuristic, which reduces the dimensionality of the feature space and improves classification. Evaluated on a publicly available colorectal cancer histology dataset using a five-fold cross-validation scheme, the proposed method achieves its highest accuracy of 97.60% and an F1-Score of 97.61%, outperforming existing state-of-the-art methods on the same dataset. Further, a qualitative investigation of class activation maps provides visual explainability to medical practitioners and justifies the use of the CAD framework in screening colorectal histology. Our source code is publicly accessible at: https://github.com/soumitri2001/SnapEnsemFS . |
first_indexed | 2024-03-13T03:21:30Z |
format | Article |
id | doaj.art-b7513d0e7c4f4bc28d32493cf7c0c979 |
institution | Directory Open Access Journal |
issn | 2045-2322 |
language | English |
last_indexed | 2024-03-13T03:21:30Z |
publishDate | 2023-06-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-b7513d0e7c4f4bc28d32493cf7c0c979 2023-06-25T11:17:37Z eng Nature Portfolio Scientific Reports 2045-2322 2023-06-01 131118 10.1038/s41598-023-36921-8 SnapEnsemFS: a snapshot ensembling-based deep feature selection model for colorectal cancer histological analysis Soumitri Chattopadhyay 0 Pawan Kumar Singh 1 Muhammad Fazal Ijaz 2 SeongKi Kim 3 Ram Sarkar 4 Department of Information Technology, Jadavpur University; Department of Information Technology, Jadavpur University; Department of Mechanical Engineering, Faculty of Engineering and Information Technology, The University of Melbourne; National Centre of Excellence in Software, Sangmyung University; Department of Computer Science & Engineering, Jadavpur University Abstract Colorectal cancer is the third most common type of cancer diagnosed annually, and the second leading cause of death due to cancer. Early diagnosis is vital for preventing tumours from spreading and for planning treatment that may eradicate the disease. However, population-wide screening is hindered by the need for medical professionals to analyse histological slides manually. This research therefore proposes an automated computer-aided detection (CAD) framework, based on deep learning, that makes predictions from histological slide images. Ensemble learning is a popular strategy for fusing the salient properties of several models to make the final predictions. However, such frameworks are computationally costly because they require training multiple base learners. In this study, we instead adopt a snapshot ensemble method: rather than fusing decision scores from the snapshots of a Convolutional Neural Network (CNN) model, as is traditionally done, we extract deep features from the penultimate layer of the CNN model. Because the deep features are extracted from the same CNN model under different learning environments, the feature set may contain redundancy. To alleviate this, the features are fed into Particle Swarm Optimization, a popular meta-heuristic, which reduces the dimensionality of the feature space and improves classification. Evaluated on a publicly available colorectal cancer histology dataset using a five-fold cross-validation scheme, the proposed method achieves its highest accuracy of 97.60% and an F1-Score of 97.61%, outperforming existing state-of-the-art methods on the same dataset. Further, a qualitative investigation of class activation maps provides visual explainability to medical practitioners and justifies the use of the CAD framework in screening colorectal histology. Our source code is publicly accessible at: https://github.com/soumitri2001/SnapEnsemFS . https://doi.org/10.1038/s41598-023-36921-8 |
spellingShingle | Soumitri Chattopadhyay Pawan Kumar Singh Muhammad Fazal Ijaz SeongKi Kim Ram Sarkar SnapEnsemFS: a snapshot ensembling-based deep feature selection model for colorectal cancer histological analysis Scientific Reports |
title | SnapEnsemFS: a snapshot ensembling-based deep feature selection model for colorectal cancer histological analysis |
title_full | SnapEnsemFS: a snapshot ensembling-based deep feature selection model for colorectal cancer histological analysis |
title_fullStr | SnapEnsemFS: a snapshot ensembling-based deep feature selection model for colorectal cancer histological analysis |
title_full_unstemmed | SnapEnsemFS: a snapshot ensembling-based deep feature selection model for colorectal cancer histological analysis |
title_short | SnapEnsemFS: a snapshot ensembling-based deep feature selection model for colorectal cancer histological analysis |
title_sort | snapensemfs a snapshot ensembling based deep feature selection model for colorectal cancer histological analysis |
url | https://doi.org/10.1038/s41598-023-36921-8 |
work_keys_str_mv | AT soumitrichattopadhyay snapensemfsasnapshotensemblingbaseddeepfeatureselectionmodelforcolorectalcancerhistologicalanalysis AT pawankumarsingh snapensemfsasnapshotensemblingbaseddeepfeatureselectionmodelforcolorectalcancerhistologicalanalysis AT muhammadfazalijaz snapensemfsasnapshotensemblingbaseddeepfeatureselectionmodelforcolorectalcancerhistologicalanalysis AT seongkikim snapensemfsasnapshotensemblingbaseddeepfeatureselectionmodelforcolorectalcancerhistologicalanalysis AT ramsarkar snapensemfsasnapshotensemblingbaseddeepfeatureselectionmodelforcolorectalcancerhistologicalanalysis |
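
The description above says that concatenated snapshot features are passed to Particle Swarm Optimization to remove redundancy. As a rough illustration only, the following is a minimal binary PSO sketch for feature selection. It is not the authors' implementation: the nearest-centroid classifier, the fitness weighting `alpha`, the sigmoid transfer function, the synthetic toy data, and every name here are assumptions made for the example.

```python
import math
import random


def nearest_centroid_accuracy(X, y, mask):
    """Training-set accuracy of a nearest-centroid classifier on the selected features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        pred = min(centroids, key=lambda c: sum(
            (x[j] - m) ** 2 for j, m in zip(cols, centroids[c])))
        correct += pred == t
    return correct / len(X)


def fitness(X, y, mask, alpha=0.9):
    """Reward accuracy and, with weight 1 - alpha, a smaller feature subset (assumed form)."""
    kept_fraction = sum(mask) / len(mask)
    return alpha * nearest_centroid_accuracy(X, y, mask) + (1 - alpha) * (1 - kept_fraction)


def binary_pso(X, y, n_particles=10, n_iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search over boolean feature masks; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    dim = len(X[0])
    pos = [[rng.random() < 0.5 for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(X, y, p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(dim):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                v = max(-6.0, min(6.0, v))  # clamp velocity so the sigmoid stays stable
                vel[i][j] = v
                # Sigmoid transfer: velocity becomes the probability of keeping feature j.
                pos[i][j] = rng.random() < 1.0 / (1.0 + math.exp(-v))
            f = fitness(X, y, pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def make_toy_features(n=60, seed=1):
    """Synthetic stand-in for concatenated features from three hypothetical snapshots."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        signal = 2.0 * label + rng.gauss(0, 0.3)
        row = []
        for _ in range(3):
            row.append(signal + rng.gauss(0, 0.1))           # near-duplicate informative feature
            row.extend(rng.gauss(0, 1.0) for _ in range(3))  # uninformative features
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_features()
    mask, fit = binary_pso(X, y)
    print("selected feature indices:", [j for j, keep in enumerate(mask) if keep])
    print("fitness:", round(fit, 4))
```

The toy data mimics the redundancy the abstract mentions: each hypothetical snapshot contributes a nearly identical informative feature plus noise, so a good mask can drop most columns without losing accuracy. In practice the fitness would be computed with a held-out split or cross-validation rather than training accuracy.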