An Enhanced Framework for Overcoming Pitfalls and Enabling Model Interpretation in Pneumonia and Covid-19 Classification

Bibliographic Details
Main Authors: Matheus A. De Castro Santos, Lilian Berton
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Subjects: Covid-19, pneumonia, CNN, image classification
Online Access: https://ieeexplore.ieee.org/document/10287355/
author Matheus A. De Castro Santos
Lilian Berton
collection DOAJ
description A large amount of health data is now generated daily, enabling the creation of Machine Learning tools that help professionals make clinical decisions. The Covid-19 pandemic also made it essential to develop tools that can assist in diagnosing this new disease. Although several papers have explored deep learning algorithms for pneumonia classification, a recent survey showed that they suffer from pitfalls that prevent their use in real scenarios (e.g., data sets from unreliable sources, training with small and imbalanced data sets, duplicated images caused by improperly merged data, demographic differences among patients such as age group, improper evaluation metrics, no external datasets for validation, and no model interpretability). Moreover, these papers do not present a complete system that can be tested in real scenarios. We address these limitations and propose a framework to overcome the identified pitfalls. We conducted a comprehensive analysis that underscores the significance of such an approach. Our work included mitigating dataset biases and testing many popular Convolutional Neural Network models under a comprehensive evaluation, and the inclusion of an external dataset strengthened the credibility of our assessment. We also engineered a prototype web platform with a containerized architecture. Because deep learning models are overparameterized and behave as black boxes, their predictions are difficult to understand, so we explored tools that show how the models make decisions. In the ternary classification experiments, the VGG16 network reached 89.7% accuracy on the external datasets. In addition, the recall of these models in detecting the presence of disease was 0.96 for Covid-19 and 0.86 for pneumonia. This result matters because health applications place great emphasis on avoiding false negatives.
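The abstract reports accuracy and per-class recall, where recall for a class is TP / (TP + FN), the share of true cases of that class that the model detects. The following is a minimal Python sketch of how both metrics can be computed from label lists. It is not the authors' code: the function names, the class label "normal", and the toy labels are assumptions made for illustration only and do not reproduce the reported results.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall}, with recall = TP / (TP + FN) for each true label."""
    support = Counter(y_true)  # TP + FN per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: true_pos[label] / support[label] for label in support}


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels for a three-class problem (not data from the paper).
y_true = ["covid", "covid", "covid", "covid",
          "pneumonia", "pneumonia", "normal", "normal"]
y_pred = ["covid", "covid", "covid", "pneumonia",
          "pneumonia", "normal", "normal", "normal"]

recall = per_class_recall(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(recall, acc)
```

In this toy data one Covid-19 case is predicted as pneumonia, which lowers Covid-19 recall but would not show up as clearly in overall accuracy; that is why per-class recall is the metric to watch when false negatives are costly.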
first_indexed 2024-03-11T15:50:44Z
format Article
id doaj.art-329325dc014f4bb9a2c351320cdf042d
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-03-11T15:50:44Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-329325dc014f4bb9a2c351320cdf042d; 2023-10-25T23:01:06Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; 11; 115330; 115347; 10.1109/ACCESS.2023.3325404; 10287355; An Enhanced Framework for Overcoming Pitfalls and Enabling Model Interpretation in Pneumonia and Covid-19 Classification; Matheus A. De Castro Santos (https://orcid.org/0000-0001-8829-7598); Lilian Berton (https://orcid.org/0000-0003-1397-6005); Institute of Science and Technology, Federal University of São Paulo, São José, Brazil; Institute of Science and Technology, Federal University of São Paulo, São José, Brazil; https://ieeexplore.ieee.org/document/10287355/; Covid-19; pneumonia; CNN; image classification
title An Enhanced Framework for Overcoming Pitfalls and Enabling Model Interpretation in Pneumonia and Covid-19 Classification
topic Covid-19
pneumonia
CNN
image classification
url https://ieeexplore.ieee.org/document/10287355/