Exploring the dominant features and data-driven detection of polycystic ovary syndrome through modified stacking ensemble machine learning technique

Polycystic ovary syndrome (PCOS) is the most frequent endocrinological anomaly in reproductive women that causes persistent hormonal secretion disruption, leading to the formation of numerous cysts within the ovaries and serious health complications. But the real-world clinical detection technique f...

Bibliographic Details
Main Authors: Sayma Alam Suha, Muhammad Nazrul Islam
Format: Article
Language: English
Published: Elsevier 2023-03-01
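Series: Heliyon
Subjects: Polycystic ovary syndrome (PCOS); Dominant features; Machine learning classification; Stacking ensemble technique
Online Access: http://www.sciencedirect.com/science/article/pii/S2405844023017255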
_version_ 1797851757711196160
author Sayma Alam Suha
Muhammad Nazrul Islam
author_facet Sayma Alam Suha
Muhammad Nazrul Islam
author_sort Sayma Alam Suha
collection DOAJ
description Polycystic ovary syndrome (PCOS) is the most frequent endocrinological anomaly in women of reproductive age; it causes persistent disruption of hormonal secretion, leading to the formation of numerous cysts within the ovaries and to serious health complications. Real-world clinical detection of PCOS is difficult, however, because the accuracy of interpretation depends substantially on the physician's expertise. An artificially intelligent PCOS prediction model might therefore be a feasible complement to the error-prone and time-consuming diagnostic procedure. In this study, a modified ensemble machine learning (ML) classification approach is proposed that uses a state-of-the-art stacking technique to identify PCOS from patients' symptom data, employing five traditional ML models as base learners and one bagging or boosting ensemble ML model as the meta-learner of the stacked model. Furthermore, three distinct feature selection strategies are applied to pick feature sets with varied numbers and combinations of attributes. To evaluate and explore the dominant features necessary for predicting PCOS, the proposed technique, in five model variants, and ten other types of classifiers are trained, tested and assessed on the different feature sets. The proposed stacking ensemble technique significantly improves accuracy over the other existing ML-based techniques for all feature sets. Among the models investigated to classify PCOS and non-PCOS patients, the stacking ensemble model with the ‘Gradient Boosting’ classifier as meta-learner outperforms the others, with 95.7% accuracy when using the top 25 features selected with the Principal Component Analysis (PCA) feature selection technique.
first_indexed 2024-04-09T19:21:55Z
format Article
id doaj.art-0e6fe3fa618947d7bd36cc345b460eb3
institution Directory Open Access Journal
issn 2405-8440
language English
last_indexed 2024-04-09T19:21:55Z
publishDate 2023-03-01
publisher Elsevier
record_format Article
series Heliyon
spelling doaj.art-0e6fe3fa618947d7bd36cc345b460eb3
2023-04-05T08:26:06Z
eng
Elsevier
Heliyon
2405-8440
2023-03-01
volume 9, issue 3, e14518
Exploring the dominant features and data-driven detection of polycystic ovary syndrome through modified stacking ensemble machine learning technique
Sayma Alam Suha (0)
Muhammad Nazrul Islam (1)
(0) Department of Computer Science and Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh
(1) Corresponding author; Department of Computer Science and Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh
http://www.sciencedirect.com/science/article/pii/S2405844023017255
Polycystic ovary syndrome (PCOS)
Dominant features
Machine learning classification
Stacking ensemble technique
title Exploring the dominant features and data-driven detection of polycystic ovary syndrome through modified stacking ensemble machine learning technique
title_sort exploring the dominant features and data driven detection of polycystic ovary syndrome through modified stacking ensemble machine learning technique
topic Polycystic ovary syndrome (PCOS)
Dominant features
Machine learning classification
Stacking ensemble technique
url http://www.sciencedirect.com/science/article/pii/S2405844023017255