An Enhanced Framework for Overcoming Pitfalls and Enabling Model Interpretation in Pneumonia and Covid-19 Classification

Currently, in the health area, a large amount of information is generated daily, enabling the creation of tools using the concepts of Machine Learning to help professionals make clinical decisions. In addition, due to the current scenario that we experienced in the Covid-19 pandemic, the development...


Bibliographic Details
Main Authors:	Matheus A. De Castro Santos, Lilian Berton
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Access
Subjects:	Covid-19, pneumonia, CNN, image classification
Online Access:	https://ieeexplore.ieee.org/document/10287355/

_version_	1827783671137959936
author	Matheus A. De Castro Santos, Lilian Berton
author_facet	Matheus A. De Castro Santos, Lilian Berton
author_sort	Matheus A. De Castro Santos
collection	DOAJ
description	Currently, in the health area, a large amount of information is generated daily, enabling the creation of tools using the concepts of Machine Learning to help professionals make clinical decisions. In addition, due to the scenario that we experienced in the Covid-19 pandemic, the development of tools that can assist in the diagnosis of this new disease has become essential. Although several papers have explored deep learning algorithms for pneumonia classification, a recent survey showed that they present some pitfalls and cannot be used in real scenarios (e.g., data sets from unreliable sources, training with small and imbalanced data sets, duplicated images due to improper merging of data, problems with demographic differences among the patients such as age group, improper evaluation metrics, no use of external datasets for validation, no model interpretability). Moreover, the papers do not present a complete system that can be tested in real scenarios. We aim to deal with these limitations and propose a framework to overcome the identified pitfalls. We conducted a comprehensive analysis that underscores the significance of such an approach. Our efforts encompassed mitigating dataset biases and testing many popular Convolutional Neural Network models under a comprehensive evaluation. The inclusion of an external dataset fortified the credibility of our assessment. We engineered a prototype web platform with a containerized architecture. Moreover, due to the overparameterization and black-box nature of deep learning models, it is difficult to understand the prediction results. We also explored tools to understand how the models make decisions. In the ternary classification experiments, the VGG16 network reached 89.7% accuracy on the external datasets. In addition, the efficiency of these models in detecting the presence of diseases in patients, measured using the recall metric, was 0.96 for Covid-19 and 0.86 for Pneumonia. This result is of great importance, since in the health area there is a strong focus on avoiding false negatives.
first_indexed	2024-03-11T15:50:44Z
format	Article
id	doaj.art-329325dc014f4bb9a2c351320cdf042d
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-03-11T15:50:44Z
publishDate	2023-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-329325dc014f4bb9a2c351320cdf042d | 2023-10-25T23:01:06Z | eng | IEEE | IEEE Access | 2169-3536 | 2023-01-01 | 11 | 115330 | 115347 | 10.1109/ACCESS.2023.3325404 | 10287355 | An Enhanced Framework for Overcoming Pitfalls and Enabling Model Interpretation in Pneumonia and Covid-19 Classification | Matheus A. De Castro Santos [0] https://orcid.org/0000-0001-8829-7598 | Lilian Berton [1] https://orcid.org/0000-0003-1397-6005 | Institute of Science and Technology, Federal University of São Paulo, São José, Brazil | Institute of Science and Technology, Federal University of São Paulo, São José, Brazil | https://ieeexplore.ieee.org/document/10287355/ | Covid-19 | pneumonia | CNN | image classification
title	An Enhanced Framework for Overcoming Pitfalls and Enabling Model Interpretation in Pneumonia and Covid-19 Classification
title_sort	enhanced framework for overcoming pitfalls and enabling model interpretation in pneumonia and covid 19 classification
topic	Covid-19, pneumonia, CNN, image classification
url	https://ieeexplore.ieee.org/document/10287355/

