An Enhanced Framework for Overcoming Pitfalls and Enabling Model Interpretation in Pneumonia and Covid-19 Classification

Bibliographic Details
Main Authors: Matheus A. De Castro Santos, Lilian Berton
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10287355/
Description
Summary: In the health area, a large amount of information is generated daily, enabling the creation of Machine Learning tools that help professionals make clinical decisions. In addition, given the scenario experienced during the Covid-19 pandemic, the development of tools that can assist in the diagnosis of this new disease has become essential. Although several papers have explored deep learning algorithms for pneumonia classification, a recent survey showed that they present pitfalls and cannot be used in real scenarios (e.g., datasets from unreliable sources, training with small and imbalanced datasets, duplicated images due to improper data merging, demographic differences among patients such as age group, improper evaluation metrics, no use of external datasets for validation, and no model interpretability). Moreover, these papers do not present a complete system that can be tested in real scenarios. We aim to address these limitations and propose a framework to overcome the identified pitfalls. We conducted a comprehensive analysis that underscores the significance of such an approach. Our efforts encompassed mitigating dataset biases and testing many popular Convolutional Neural Network models under a comprehensive evaluation. The inclusion of an external dataset strengthened the credibility of our assessment. We engineered a prototype web platform with a containerized architecture. Moreover, due to the overparameterization and black-box nature of deep learning models, it is difficult to understand their prediction results, so we also explored tools to understand how the models make decisions. In the ternary classification experiments, the VGG16 network reached 89.7% accuracy on the external datasets.
In addition, the efficiency of these models in detecting the presence of diseases in patients, measured using the recall metric, was 0.96 for Covid-19 and 0.86 for Pneumonia. This result is of great importance, since in the health area there is a strong focus on avoiding false negatives.
ISSN: 2169-3536
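
The summary reports per-class recall and ties it to avoiding false negatives. As a reading aid, here is a minimal sketch of how per-class recall, TP / (TP + FN), can be computed for a three-class problem. The function name, the label strings, and the toy predictions are assumptions made for illustration; they are not the authors' code or data, and they do not reproduce the reported 0.96 and 0.86 values.

```python
def per_class_recall(y_true, y_pred, labels):
    """Return {label: recall}, where recall = TP / (TP + FN) for each class."""
    recalls = {}
    for label in labels:
        # True positives: samples of this class that were predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        # False negatives: samples of this class predicted as any other class.
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        # Recall is undefined when the class never occurs in y_true.
        recalls[label] = tp / (tp + fn) if (tp + fn) else float("nan")
    return recalls


# Hypothetical label names and toy data, for illustration only.
labels = ["covid19", "pneumonia", "normal"]
y_true = ["covid19"] * 4 + ["pneumonia"] * 4 + ["normal"] * 2
y_pred = [
    "covid19", "covid19", "covid19", "pneumonia",   # one Covid-19 case missed
    "pneumonia", "pneumonia", "normal", "normal",   # two pneumonia cases missed
    "normal", "normal",
]

recalls = per_class_recall(y_true, y_pred, labels)
for label in labels:
    print(f"{label}: recall = {recalls[label]:.2f}")
```

In this toy data, each missed Covid-19 or pneumonia case is a false negative and lowers the recall of its class, which is why the metric suits settings where a missed diagnosis is the costlier error.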