Explainable Machine Learning Approach for Hepatitis C Diagnosis Using SFS Feature Selection
Hepatitis C is a significant public health concern, resulting in substantial morbidity and mortality worldwide. Early diagnosis and effective treatment are essential to prevent the disease’s progression to chronic liver disease. Machine learning algorithms have been increasingly used to develop pred...
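Main Authors: | Ali Mohd Ali, Mohammad R. Hassan, Faisal Aburub, Mohammad Alauthman, Amjad Aldweesh, Ahmad Al-Qerem, Issam Jebreen, Ahmad Nabot |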
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-03-01 |
Series: | Machines |
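Subjects: | hepatitis C; data augmentation; feature selection; classification algorithms; machine learning; SHAP |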
Online Access: | https://www.mdpi.com/2075-1702/11/3/391 |
_version_ | 1797610543803006976 |
---|---|
author | Ali Mohd Ali; Mohammad R. Hassan; Faisal Aburub; Mohammad Alauthman; Amjad Aldweesh; Ahmad Al-Qerem; Issam Jebreen; Ahmad Nabot |
author_sort | Ali Mohd Ali |
collection | DOAJ |
description | Hepatitis C is a significant public health concern, resulting in substantial morbidity and mortality worldwide. Early diagnosis and effective treatment are essential to prevent the disease’s progression to chronic liver disease. Machine learning algorithms have been increasingly used to develop predictive models for various diseases, including hepatitis C. This study aims to evaluate the performance of several machine learning algorithms in diagnosing chronic liver disease, with a specific focus on hepatitis C, to improve the cost-effectiveness and efficiency of the diagnostic process. We collected a comprehensive dataset of 1801 patient records, each with 12 distinct features, from Jordan University Hospital. To assess the robustness and dependability of our proposed framework, we conducted two research scenarios, one with feature selection and one without. We also employed the Sequential Forward Selection (SFS) method to identify the most relevant features that can enhance the model’s accuracy. Moreover, we investigated the effect of the synthetic minority oversampling technique (SMOTE) on the accuracy of the model’s predictions. Our findings indicate that all machine learning models achieved an average accuracy of 83% when applied to the dataset. Furthermore, the use of SMOTE did not significantly affect the accuracy of the model’s predictions. Despite the increasing use of machine learning models in medical diagnosis, there is a growing concern about their interpretability. As such, we addressed this issue by utilizing the Shapley Additive Explanations (SHAP) method to explain the predictions of our machine learning model, which was specifically developed for hepatitis C prediction in Jordan. This work provides a comprehensive evaluation of various machine learning algorithms in diagnosing chronic liver disease, with a particular emphasis on hepatitis C. The results provide valuable insights into the cost-effectiveness and efficiency of the diagnostic process and highlight the importance of interpretability in medical diagnosis. |
first_indexed | 2024-03-11T06:16:45Z |
format | Article |
id | doaj.art-e6fe463ddb6e43d1ada25ff8c4ab3e95 |
institution | Directory of Open Access Journals |
issn | 2075-1702 |
language | English |
last_indexed | 2024-03-11T06:16:45Z |
publishDate | 2023-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Machines |
spelling | doaj.art-e6fe463ddb6e43d1ada25ff8c4ab3e95; 2023-11-17T12:15:51Z; eng; MDPI AG; Machines; 2075-1702; 2023-03-01; 11(3), 391; 10.3390/machines11030391; Explainable Machine Learning Approach for Hepatitis C Diagnosis Using SFS Feature Selection; Ali Mohd Ali (Communications and Computer Engineering Department, Faculty of Engineering, Al-Ahliyya Amman University, Amman 19328, Jordan); Mohammad R. Hassan (Communications and Computer Engineering Department, Faculty of Engineering, Al-Ahliyya Amman University, Amman 19328, Jordan); Faisal Aburub (Department of Business Intelligence and Data Analytics, University of Petra, Amman 961343, Jordan); Mohammad Alauthman (Department of Information Security, Faculty of Information Technology, University of Petra, Amman 961343, Jordan); Amjad Aldweesh (College of Computing and Information Technology, Shaqra University, Shaqra 11911, Saudi Arabia); Ahmad Al-Qerem (Computer Science Department, Faculty of Information Technology, Zarqa University, Zarqa 13110, Jordan); Issam Jebreen (Software Engineering Department, Faculty of Information Technology, Zarqa University, Zarqa 13110, Jordan); Ahmad Nabot (Software Engineering Department, Faculty of Information Technology, Zarqa University, Zarqa 13110, Jordan); https://www.mdpi.com/2075-1702/11/3/391; hepatitis C; data augmentation; feature selection; classification algorithms; machine learning; SHAP |
title | Explainable Machine Learning Approach for Hepatitis C Diagnosis Using SFS Feature Selection |
title_sort | explainable machine learning approach for hepatitis c diagnosis using sfs feature selection |
topic | hepatitis C; data augmentation; feature selection; classification algorithms; machine learning; SHAP |
url | https://www.mdpi.com/2075-1702/11/3/391 |
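
The abstract describes two scenarios, one with Sequential Forward Selection (SFS) and one without. The record does not say which classifiers or libraries the authors used, so the following is only a minimal sketch of the general SFS idea under stated assumptions: the data are synthetic (12 features, of which only features 0 and 1 carry signal), a nearest-centroid classifier stands in for the unspecified models, and every function name is hypothetical.

```python
import random
from statistics import mean

N_FEATURES = 12  # matches the feature count in the abstract; the data below are synthetic


def make_synthetic_records(n, rng):
    """Generate toy two-class records; only features 0 and 1 are informative (assumption)."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
        row[0] += 4.0 * label  # strongly informative feature
        row[1] += 1.5 * label  # weakly informative feature
        X.append(row)
        y.append(label)
    return X, y


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val, features):
    """Validation accuracy of a nearest-centroid classifier restricted to `features`."""
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [mean(r[f] for r in rows) for f in features]
    correct = 0
    for x, t in zip(X_val, y_val):
        pred = min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(features, centroids[c])),
        )
        correct += pred == t
    return correct / len(y_val)


def sequential_forward_selection(X_train, y_train, X_val, y_val, k):
    """Greedily add the feature that most improves validation accuracy until k are chosen."""
    selected = []
    remaining = list(range(len(X_train[0])))
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda f: nearest_centroid_accuracy(
                X_train, y_train, X_val, y_val, selected + [f]
            ),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


rng = random.Random(0)
X_train, y_train = make_synthetic_records(300, rng)
X_val, y_val = make_synthetic_records(200, rng)

# Scenario with feature selection: keep the 3 features chosen by SFS.
selected = sequential_forward_selection(X_train, y_train, X_val, y_val, k=3)
acc_selected = nearest_centroid_accuracy(X_train, y_train, X_val, y_val, selected)

# Scenario without feature selection: use all 12 features.
acc_all = nearest_centroid_accuracy(
    X_train, y_train, X_val, y_val, list(range(N_FEATURES))
)

print("selected features:", selected)
print("accuracy with SFS:", acc_selected)
print("accuracy with all features:", acc_all)
```

In this sketch the same validation split both drives the selection and scores the result, which tends to flatter the selected subset; a real evaluation would score on a separate held-out set or use nested cross-validation.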