Explainable Machine Learning Approach for Hepatitis C Diagnosis Using SFS Feature Selection

Hepatitis C is a significant public health concern, resulting in substantial morbidity and mortality worldwide. Early diagnosis and effective treatment are essential to prevent the disease’s progression to chronic liver disease. Machine learning algorithms have been increasingly used to develop pred...

Bibliographic Details
Main Authors: Ali Mohd Ali, Mohammad R. Hassan, Faisal Aburub, Mohammad Alauthman, Amjad Aldweesh, Ahmad Al-Qerem, Issam Jebreen, Ahmad Nabot
Format: Article
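Language: English
Published: MDPI AG 2023-03-01
Series: Machines
Subjects: hepatitis C, data augmentation, feature selection, classification algorithms, machine learning, SHAP
Online Access: https://www.mdpi.com/2075-1702/11/3/391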
author Ali Mohd Ali
Mohammad R. Hassan
Faisal Aburub
Mohammad Alauthman
Amjad Aldweesh
Ahmad Al-Qerem
Issam Jebreen
Ahmad Nabot
collection DOAJ
description Hepatitis C is a significant public health concern, resulting in substantial morbidity and mortality worldwide. Early diagnosis and effective treatment are essential to prevent the disease’s progression to chronic liver disease. Machine learning algorithms have been increasingly used to develop predictive models for various diseases, including hepatitis C. This study aims to evaluate the performance of several machine learning algorithms in diagnosing chronic liver disease, with a specific focus on hepatitis C, to improve the cost-effectiveness and efficiency of the diagnostic process. We collected a comprehensive dataset of 1801 patient records, each with 12 distinct features, from Jordan University Hospital. To assess the robustness and dependability of our proposed framework, we conducted two research scenarios, one with feature selection and one without. We also employed the Sequential Forward Selection (SFS) method to identify the most relevant features that can enhance the model’s accuracy. Moreover, we investigated the effect of the synthetic minority oversampling technique (SMOTE) on the accuracy of the model’s predictions. Our findings indicate that all machine learning models achieved an average accuracy of 83% when applied to the dataset. Furthermore, the use of SMOTE did not significantly affect the accuracy of the model’s predictions. Despite the increasing use of machine learning models in medical diagnosis, there is a growing concern about their interpretability. As such, we addressed this issue by utilizing the Shapley Additive Explanations (SHAP) method to explain the predictions of our machine learning model, which was specifically developed for hepatitis C prediction in Jordan. This work provides a comprehensive evaluation of various machine learning algorithms in diagnosing chronic liver disease, with a particular emphasis on hepatitis C. The results provide valuable insights into the cost-effectiveness and efficiency of the diagnostic process and highlight the importance of interpretability in medical diagnosis.
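The Sequential Forward Selection step named in the description can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the synthetic data, the nearest-centroid classifier, the holdout scoring, and the function names `make_data`, `centroid_accuracy`, and `sequential_forward_selection`. It shows only the greedy idea behind SFS: start with no features and repeatedly add the one that most improves the score.

```python
import random


def make_data(n=1000, n_features=6, seed=0):
    """Synthetic binary dataset (assumption): only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 3.0 * label  # informative feature
        row[1] -= 3.0 * label  # informative feature
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X_tr, y_tr, X_te, y_te, cols):
    """Holdout accuracy of a nearest-centroid classifier restricted to `cols`."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X_tr, y_tr) if t == label]
        centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
    correct = 0
    for x, t in zip(X_te, y_te):
        dists = {
            label: sum((x[c] - m) ** 2 for c, m in zip(cols, centroids[label]))
            for label in (0, 1)
        }
        pred = min(dists, key=dists.get)
        correct += pred == t
    return correct / len(y_te)


def sequential_forward_selection(X_tr, y_tr, X_te, y_te, k):
    """Greedily add the feature that gives the best holdout accuracy until k are chosen."""
    selected, history = [], []
    remaining = list(range(len(X_tr[0])))
    while len(selected) < k:
        scored = [
            (centroid_accuracy(X_tr, y_tr, X_te, y_te, selected + [f]), f)
            for f in remaining
        ]
        # Highest score wins; ties go to the lowest feature index.
        best_score, best_f = max(scored, key=lambda s: (s[0], -s[1]))
        selected.append(best_f)
        remaining.remove(best_f)
        history.append((best_f, best_score))
    return selected, history


X, y = make_data()
split = 500  # first half for fitting centroids, second half for scoring
selected, history = sequential_forward_selection(
    X[:split], y[:split], X[split:], y[split:], k=2
)
print("selected features:", selected)
print("accuracy after each addition:", history)
```

A real study would more likely score candidates with cross-validation rather than a single holdout split, and would wrap whichever classifier is under evaluation; the greedy loop itself stays the same.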
first_indexed 2024-03-11T06:16:45Z
format Article
id doaj.art-e6fe463ddb6e43d1ada25ff8c4ab3e95
institution Directory Open Access Journal
issn 2075-1702
language English
last_indexed 2024-03-11T06:16:45Z
publishDate 2023-03-01
publisher MDPI AG
record_format Article
series Machines
spelling doaj.art-e6fe463ddb6e43d1ada25ff8c4ab3e95; 2023-11-17T12:15:51Z; eng; MDPI AG; Machines; ISSN 2075-1702; 2023-03-01; vol. 11, no. 3, art. 391; DOI 10.3390/machines11030391
Explainable Machine Learning Approach for Hepatitis C Diagnosis Using SFS Feature Selection
Ali Mohd Ali (0): Communications and Computer Engineering Department, Faculty of Engineering, Al-Ahliyya Amman University, Amman 19328, Jordan
Mohammad R. Hassan (1): Communications and Computer Engineering Department, Faculty of Engineering, Al-Ahliyya Amman University, Amman 19328, Jordan
Faisal Aburub (2): Department of Business Intelligence and Data Analytics, University of Petra, Amman 961343, Jordan
Mohammad Alauthman (3): Department of Information Security, Faculty of Information Technology, University of Petra, Amman 961343, Jordan
Amjad Aldweesh (4): College of Computing and Information Technology, Shaqra University, Shaqra 11911, Saudi Arabia
Ahmad Al-Qerem (5): Computer Science Department, Faculty of Information Technology, Zarqa University, Zarqa 13110, Jordan
Issam Jebreen (6): Software Engineering Department, Faculty of Information Technology, Zarqa University, Zarqa 13110, Jordan
Ahmad Nabot (7): Software Engineering Department, Faculty of Information Technology, Zarqa University, Zarqa 13110, Jordan
title Explainable Machine Learning Approach for Hepatitis C Diagnosis Using SFS Feature Selection
topic hepatitis C
data augmentation
feature selection
classification algorithms
machine learning
SHAP
url https://www.mdpi.com/2075-1702/11/3/391