Exploring the dominant features and data-driven detection of polycystic ovary syndrome through modified stacking ensemble machine learning technique
Polycystic ovary syndrome (PCOS) is the most frequent endocrinological anomaly in women of reproductive age; it causes persistent disruption of hormone secretion, leading to the formation of numerous cysts within the ovaries and to serious health complications. But the real-world clinical detection technique f...
Main Authors: | Sayma Alam Suha, Muhammad Nazrul Islam |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2023-03-01 |
Series: | Heliyon |
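Subjects: | Polycystic ovary syndrome (PCOS); Dominant features; Machine learning classification; Stacking ensemble technique |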
Online Access: | http://www.sciencedirect.com/science/article/pii/S2405844023017255 |
_version_ | 1797851757711196160 |
---|---|
author | Sayma Alam Suha; Muhammad Nazrul Islam |
author_facet | Sayma Alam Suha; Muhammad Nazrul Islam |
author_sort | Sayma Alam Suha |
collection | DOAJ |
description | Polycystic ovary syndrome (PCOS) is the most frequent endocrinological anomaly in women of reproductive age; it causes persistent disruption of hormone secretion, leading to the formation of numerous cysts within the ovaries and to serious health complications. Real-world clinical detection of PCOS is challenging, however, because the accuracy of interpretation depends substantially on the physician's expertise. An artificially intelligent PCOS prediction model might therefore be a feasible complement to the error-prone and time-consuming diagnostic procedure. In this study, a modified ensemble machine learning (ML) classification approach is proposed that uses the state-of-the-art stacking technique to identify PCOS from patients' symptom data, employing five traditional ML models as base learners and one bagging or boosting ensemble ML model as the meta-learner of the stacked model. Furthermore, three distinct feature selection strategies are applied to pick feature sets with varied numbers and combinations of attributes. To evaluate and explore the dominant features necessary for predicting PCOS, five variants of the proposed technique and ten other types of classifiers are trained, tested and assessed on the different feature sets. The proposed stacking ensemble technique significantly improves accuracy over the other existing ML-based techniques for all feature sets. Among the models investigated to classify PCOS and non-PCOS patients, the stacking ensemble model with a 'Gradient Boosting' classifier as meta-learner outperforms the others, with 95.7% accuracy when using the top 25 features selected by the Principal Component Analysis (PCA) feature selection technique. |
first_indexed | 2024-04-09T19:21:55Z |
format | Article |
id | doaj.art-0e6fe3fa618947d7bd36cc345b460eb3 |
institution | Directory of Open Access Journals |
issn | 2405-8440 |
language | English |
last_indexed | 2024-04-09T19:21:55Z |
publishDate | 2023-03-01 |
publisher | Elsevier |
record_format | Article |
series | Heliyon |
spelling | doaj.art-0e6fe3fa618947d7bd36cc345b460eb3; 2023-04-05T08:26:06Z; eng; Elsevier; Heliyon; 2405-8440; 2023-03-01; vol. 9, no. 3, e14518; Exploring the dominant features and data-driven detection of polycystic ovary syndrome through modified stacking ensemble machine learning technique; Sayma Alam Suha (0), Muhammad Nazrul Islam (1, corresponding author); (0, 1) Department of Computer Science and Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh; http://www.sciencedirect.com/science/article/pii/S2405844023017255; Polycystic ovary syndrome (PCOS); Dominant features; Machine learning classification; Stacking ensemble technique |
title | Exploring the dominant features and data-driven detection of polycystic ovary syndrome through modified stacking ensemble machine learning technique |
topic | Polycystic ovary syndrome (PCOS); Dominant features; Machine learning classification; Stacking ensemble technique |
url | http://www.sciencedirect.com/science/article/pii/S2405844023017255 |
work_keys_str_mv | AT saymaalamsuha exploringthedominantfeaturesanddatadrivendetectionofpolycysticovarysyndromethroughmodifiedstackingensemblemachinelearningtechnique AT muhammadnazrulislam exploringthedominantfeaturesanddatadrivendetectionofpolycysticovarysyndromethroughmodifiedstackingensemblemachinelearningtechnique |
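
The description above outlines a stacked classifier: five traditional ML models as base learners, a boosting model as meta-learner, and a PCA step that keeps 25 features. The sketch below is one way such a pipeline could be assembled with scikit-learn; it is an illustration under stated assumptions, not the authors' implementation. The record does not name the five base learners, so the ones chosen here are placeholders, and the data is synthetic because the study's dataset is not part of this record. PCA is used here as a projection onto 25 components, which may differ from how the article derives its "top 25 features". The helper name `build_stacked_model` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_stacked_model(n_components=25, random_state=0):
    """Scale -> PCA -> stacking ensemble with a Gradient Boosting meta-learner."""
    # ASSUMPTION: these five base learners are illustrative choices only;
    # the catalog record does not say which models the article uses.
    base_learners = [
        ("logreg", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("tree", DecisionTreeClassifier(random_state=random_state)),
        ("nb", GaussianNB()),
        ("svm", SVC(random_state=random_state)),
    ]
    # The meta-learner is trained on out-of-fold predictions of the base
    # learners (5-fold cross-validation inside StackingClassifier).
    stack = StackingClassifier(
        estimators=base_learners,
        final_estimator=GradientBoostingClassifier(random_state=random_state),
        cv=5,
    )
    # ASSUMPTION: PCA projects onto 25 components as a stand-in for the
    # article's PCA-based selection of 25 features.
    return make_pipeline(StandardScaler(), PCA(n_components=n_components), stack)


if __name__ == "__main__":
    # Synthetic placeholder data, NOT the PCOS patient data from the study.
    X, y = make_classification(
        n_samples=400, n_features=40, n_informative=10, random_state=0
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0
    )
    model = build_stacked_model()
    model.fit(X_train, y_train)
    print(f"held-out accuracy: {model.score(X_test, y_test):.3f}")
```

Wrapping the scaler, the PCA step, and the ensemble in one pipeline means the preprocessing is fitted on the training split only, so the held-out score is not influenced by test data. The accuracy printed for the synthetic data has no bearing on the 95.7% figure reported in the description.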