Diagnosis of Endometriosis Based on Comorbidities: A Machine Learning Approach
Endometriosis is defined as the presence of estrogen-dependent endometrial-like tissue outside the uterine cavity. Despite extensive research, endometriosis is still an enigmatic disease and is challenging to diagnose and treat. A common clinical finding is the association of endometriosis with mult...
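Main Authors: | Ulan Tore, Aibek Abilgazym, Angel Asunsolo-del-Barco, Milan Terzic, Yerden Yemenkhan, Amin Zollanvari, Antonio Sarria-Santamera |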
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-11-01 |
Series: | Biomedicines |
Subjects: | endometriosis; comorbidities; machine learning; XGBoost; AdaBoost; random forest |
Online Access: | https://www.mdpi.com/2227-9059/11/11/3015 |
_version_ | 1797460071505985536 |
---|---|
author | Ulan Tore, Aibek Abilgazym, Angel Asunsolo-del-Barco, Milan Terzic, Yerden Yemenkhan, Amin Zollanvari, Antonio Sarria-Santamera |
author_sort | Ulan Tore |
collection | DOAJ |
description | Endometriosis is defined as the presence of estrogen-dependent endometrial-like tissue outside the uterine cavity. Despite extensive research, endometriosis remains an enigmatic disease that is challenging to diagnose and treat. A common clinical finding is the association of endometriosis with multiple diseases. We use a total of 627,566 clinically collected records, comprising cases of endometriosis (0.82%) and controls (99.18%), to construct and evaluate predictive models. We develop a machine learning platform to construct diagnostic tools for endometriosis. The platform uses logistic regression, decision tree, random forest, AdaBoost, and XGBoost for prediction, and Shapley Additive Explanation (SHAP) values to quantify the importance of features. In the model selection phase, the XGBoost model outperforms the other algorithms. In the evaluation phase, it achieves an area under the curve (AUC) of 0.725 on the test set, with a specificity of 62.9% and a sensitivity of 68.6%. The model yields a low positive predictive value of 1.5% but a high negative predictive value of 99.58%. Moreover, the feature importance analysis identifies age, infertility, uterine fibroids, anxiety, and allergic rhinitis as the five most important features for predicting endometriosis. Although these results show the feasibility of using machine learning to improve the diagnosis of endometriosis, more research is required to improve the performance of the predictive models. The limited performance is attributed in part to the complex nature of the condition and in part to the administrative nature of our features. With more informative features, a higher AUC for predicting endometriosis might be achievable. As a result, we regard the constructed predictive model merely as a tool to provide auxiliary information in clinical practice. |
first_indexed | 2024-03-09T17:00:04Z |
format | Article |
id | doaj.art-c5430d27e5a04c6f938e8795ae5b2899 |
institution | Directory of Open Access Journals |
issn | 2227-9059 |
language | English |
last_indexed | 2024-03-09T17:00:04Z |
publishDate | 2023-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Biomedicines |
spelling | doaj.art-c5430d27e5a04c6f938e8795ae5b2899; 2023-11-24T14:31:10Z; eng; MDPI AG; Biomedicines; 2227-9059; 2023-11-01; vol. 11, no. 11, article 3015; doi:10.3390/biomedicines11113015; Diagnosis of Endometriosis Based on Comorbidities: A Machine Learning Approach; Ulan Tore (School of Engineering and Digital Sciences, Nazarbayev University, Astana 010000, Kazakhstan); Aibek Abilgazym (School of Engineering and Digital Sciences, Nazarbayev University, Astana 010000, Kazakhstan); Angel Asunsolo-del-Barco (Department of Surgery, Medical and Social Sciences, Faculty of Medicine, University of Alcalá, 288871 Alcalá de Henares, Spain); Milan Terzic (Department of Surgery, School of Medicine, Nazarbayev University, Astana 010000, Kazakhstan); Yerden Yemenkhan (Department of Medicine, School of Medicine, Nazarbayev University, Astana 010000, Kazakhstan); Amin Zollanvari (School of Engineering and Digital Sciences, Nazarbayev University, Astana 010000, Kazakhstan); Antonio Sarria-Santamera (Department of Biomedical Sciences, School of Medicine, Nazarbayev University, Astana 010000, Kazakhstan); https://www.mdpi.com/2227-9059/11/11/3015; endometriosis, comorbidities, machine learning, XGBoost, AdaBoost, random forest |
title | Diagnosis of Endometriosis Based on Comorbidities: A Machine Learning Approach |
topic | endometriosis; comorbidities; machine learning; XGBoost; AdaBoost; random forest |
url | https://www.mdpi.com/2227-9059/11/11/3015 |
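
The abstract reports a positive predictive value of only 1.5% alongside a sensitivity of 68.6% and a specificity of 62.9%. The following minimal sketch shows how such predictive values follow from Bayes' rule when a condition is rare. It assumes that the prevalence in the test set equals the overall case share of 0.82% stated in the abstract; the record does not confirm this, and the function name `predictive_values` is only illustrative, not part of the authors' platform.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    All three arguments are fractions in [0, 1].
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Figures taken from the abstract; the prevalence is ASSUMED to match
# the overall share of endometriosis cases (0.82%).
ppv, npv = predictive_values(sensitivity=0.686, specificity=0.629, prevalence=0.0082)
print(f"PPV = {ppv:.1%}, NPV = {npv:.2%}")
```

Under this assumption the sketch gives a PPV of about 1.5% and an NPV of about 99.59%, close to the reported 1.5% and 99.58%. The small difference in the NPV is plausibly due to rounding of the published inputs or a slightly different prevalence in the test set. The calculation illustrates why a moderately accurate classifier still produces mostly false positives when fewer than 1% of records are cases.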