An Enhanced Framework for Overcoming Pitfalls and Enabling Model Interpretation in Pneumonia and Covid-19 Classification
Currently, a large amount of health information is generated daily, enabling the creation of Machine Learning tools that help professionals make clinical decisions. In addition, due to the scenario experienced during the Covid-19 pandemic, the development...
Main Authors: | Matheus A. De Castro Santos, Lilian Berton |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Access |
Subjects: | Covid-19, pneumonia, CNN, image classification |
Online Access: | https://ieeexplore.ieee.org/document/10287355/ |
_version_ | 1827783671137959936 |
---|---|
author | Matheus A. De Castro Santos; Lilian Berton |
author_facet | Matheus A. De Castro Santos; Lilian Berton |
author_sort | Matheus A. De Castro Santos |
collection | DOAJ |
description | Currently, a large amount of health information is generated daily, enabling the creation of Machine Learning tools that help professionals make clinical decisions. In addition, the Covid-19 pandemic made the development of tools that can assist in diagnosing this new disease essential. Although several papers have explored deep learning algorithms for pneumonia classification, a recent survey showed that they exhibit several pitfalls and cannot be used in real scenarios (e.g., data sets from unreliable sources, training with small and imbalanced data sets, duplicated images due to improper merging of data, demographic differences among patients such as age group, improper evaluation metrics, no use of external datasets for validation, and no model interpretability). Moreover, these papers do not present a complete system that can be tested in real scenarios. We aim to address these limitations and propose a framework to overcome the identified pitfalls. We conducted a comprehensive analysis that underscores the significance of such an approach. Our efforts encompassed mitigating dataset biases and testing many popular Convolutional Neural Network models under a comprehensive evaluation. The inclusion of an external dataset strengthened the credibility of our assessment. We engineered a prototype web platform with a containerized architecture. Moreover, due to the overparameterization and black-box nature of deep learning models, it is difficult to understand their prediction results, so we also explored tools to understand how the models make decisions. In our experiments on ternary classification, the VGG16 network reached 89.7% accuracy on the external datasets. In addition, the recall of these models, which measures how well they detect the presence of disease in patients, was 0.96 for Covid-19 and 0.86 for Pneumonia. This result is important because health applications place great emphasis on avoiding false negatives. |
first_indexed | 2024-03-11T15:50:44Z |
format | Article |
id | doaj.art-329325dc014f4bb9a2c351320cdf042d |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-11T15:50:44Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-329325dc014f4bb9a2c351320cdf042d; 2023-10-25T23:01:06Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2023-01-01; vol. 11; pp. 115330-115347; DOI 10.1109/ACCESS.2023.3325404; document 10287355; An Enhanced Framework for Overcoming Pitfalls and Enabling Model Interpretation in Pneumonia and Covid-19 Classification; Matheus A. De Castro Santos (https://orcid.org/0000-0001-8829-7598); Lilian Berton (https://orcid.org/0000-0003-1397-6005); Institute of Science and Technology, Federal University of São Paulo, São José, Brazil; https://ieeexplore.ieee.org/document/10287355/; Covid-19; pneumonia; CNN; image classification |
title | An Enhanced Framework for Overcoming Pitfalls and Enabling Model Interpretation in Pneumonia and Covid-19 Classification |
topic | Covid-19; pneumonia; CNN; image classification |
url | https://ieeexplore.ieee.org/document/10287355/ |
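
The description reports per-class recall (0.96 for Covid-19 and 0.86 for Pneumonia) and ties it to avoiding false negatives. As a minimal sketch of how such a figure can be computed for a three-class problem, the Python snippet below calculates recall = TP / (TP + FN) for each class. The function name `per_class_recall`, the label strings, and the toy predictions are assumptions made only for illustration; they are not the authors' code or data, and the numbers produced are unrelated to the reported results.

```python
from collections import Counter


def per_class_recall(y_true, y_pred, labels):
    """Return {label: recall}, with recall = TP / (TP + FN) per class.

    A class with no true samples gets NaN, since its recall is undefined.
    """
    true_pos = Counter()  # correctly predicted samples, keyed by true class
    support = Counter()   # number of true samples per class (TP + FN)
    for truth, pred in zip(y_true, y_pred):
        support[truth] += 1
        if truth == pred:
            true_pos[truth] += 1
    return {
        lab: (true_pos[lab] / support[lab]) if support[lab] else float("nan")
        for lab in labels
    }


# Hypothetical label names for the ternary task described in the record.
LABELS = ["covid", "pneumonia", "normal"]

# Toy ground truth and predictions (illustrative only, not study data).
y_true = ["covid"] * 4 + ["pneumonia"] * 4 + ["normal"] * 2
y_pred = (
    ["covid", "covid", "covid", "pneumonia"]          # one missed Covid-19 case
    + ["pneumonia", "pneumonia", "normal", "normal"]  # two missed pneumonia cases
    + ["normal", "normal"]
)

recalls = per_class_recall(y_true, y_pred, LABELS)
for label in LABELS:
    print(f"{label}: recall = {recalls[label]:.2f}")
```

Each missed case (a false negative) lowers the recall of its true class only, which is why recall is reported per disease rather than as a single aggregate when false negatives are the main clinical concern.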