Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence
Polycystic ovary syndrome (PCOS) has been classified as a severe health problem common among women globally. Early detection and treatment of PCOS reduce the risk of long-term complications, such as type 2 diabetes and gestational diabetes. Therefore, effe...
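Main Authors: | Hela Elmannai, Nora El-Rashidy, Ibrahim Mashal, Manal Abdullah Alohali, Sara Farag, Shaker El-Sappagh, Hager Saleh |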
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-04-01 |
Series: | Diagnostics |
Subjects: | polycystic ovary syndrome; machine learning; explainable machine learning; ensemble learning |
Online Access: | https://www.mdpi.com/2075-4418/13/8/1506 |
_version_ | 1827745333115879424 |
---|---|
author | Hela Elmannai; Nora El-Rashidy; Ibrahim Mashal; Manal Abdullah Alohali; Sara Farag; Shaker El-Sappagh; Hager Saleh |
author_facet | Hela Elmannai; Nora El-Rashidy; Ibrahim Mashal; Manal Abdullah Alohali; Sara Farag; Shaker El-Sappagh; Hager Saleh |
author_sort | Hela Elmannai |
collection | DOAJ |
description | Polycystic ovary syndrome (PCOS) has been classified as a severe health problem common among women globally. Early detection and treatment of PCOS reduce the risk of long-term complications, such as type 2 diabetes and gestational diabetes. Therefore, effective and early PCOS diagnosis will help healthcare systems reduce the disease’s problems and complications. Machine learning (ML) and ensemble learning have recently shown promising results in medical diagnostics. The main goal of our research is to provide local and global model explanations that ensure efficiency, effectiveness, and trust in the developed model. Feature selection methods are applied with different types of ML models (logistic regression (LR), random forest (RF), decision tree (DT), naive Bayes (NB), support vector machine (SVM), k-nearest neighbor (KNN), XGBoost, and AdaBoost) to obtain the optimal feature subset and the best model. Stacking ML models that combine the best base ML models with a meta-learner are proposed to improve performance. Bayesian optimization is used to optimize the ML models. Combining SMOTE (Synthetic Minority Oversampling Technique) and ENN (Edited Nearest Neighbour) addresses the class imbalance. The experiments were conducted on a benchmark PCOS dataset with two train/test split ratios, 70:30 and 80:20. The results showed that the stacking ML model with RFE (recursive feature elimination) feature selection recorded the highest accuracy, 100%, compared to other models. |
first_indexed | 2024-03-11T05:06:36Z |
format | Article |
id | doaj.art-f14e3738d3ff4b0d9d8e3ff1f580504d |
institution | Directory Open Access Journal |
issn | 2075-4418 |
language | English |
last_indexed | 2024-03-11T05:06:36Z |
publishDate | 2023-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Diagnostics |
spelling | doaj.art-f14e3738d3ff4b0d9d8e3ff1f580504d; 2023-11-17T18:56:01Z; eng; MDPI AG; Diagnostics; 2075-4418; 2023-04-01; vol. 13, no. 8, art. 1506; 10.3390/diagnostics13081506; Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence; Hela Elmannai (0), Nora El-Rashidy (1), Ibrahim Mashal (2), Manal Abdullah Alohali (3), Sara Farag (4), Shaker El-Sappagh (5), Hager Saleh (6); Affiliations: (0) Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia; (1) Machine Learning and Information Retrieval Department, Faculty of Artificial Intelligence, Kafrelsheiksh University, Kafrelsheiksh 13518, Egypt; (2) Faculty of Information Technology, Applied Science Private University, Amman 11937, Jordan; (3) Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia; (4) Faculty of Computers and Informations, South Valley University, Qena 83523, Egypt; (5) Faculty of Computer Science and Engineering, Galala University, Suez 435611, Egypt; (6) Faculty of Computers and Artificial Intelligence, South Valley University, Hurghada 84511, Egypt; https://www.mdpi.com/2075-4418/13/8/1506; polycystic ovary syndrome; machine learning; explainable machine learning; ensemble learning |
spellingShingle | Hela Elmannai; Nora El-Rashidy; Ibrahim Mashal; Manal Abdullah Alohali; Sara Farag; Shaker El-Sappagh; Hager Saleh; Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence; Diagnostics; polycystic ovary syndrome; machine learning; explainable machine learning; ensemble learning |
title | Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence |
topic | polycystic ovary syndrome; machine learning; explainable machine learning; ensemble learning |
url | https://www.mdpi.com/2075-4418/13/8/1506 |