Diagnosis of Endometriosis Based on Comorbidities: A Machine Learning Approach

Endometriosis is defined as the presence of estrogen-dependent endometrial-like tissue outside the uterine cavity. Despite extensive research, endometriosis is still an enigmatic disease and is challenging to diagnose and treat. A common clinical finding is the association of endometriosis with mult...

Bibliographic Details
Main Authors: Ulan Tore, Aibek Abilgazym, Angel Asunsolo-del-Barco, Milan Terzic, Yerden Yemenkhan, Amin Zollanvari, Antonio Sarria-Santamera
Format: Article
Language: English
Published: MDPI AG 2023-11-01
Series: Biomedicines
Subjects: endometriosis; comorbidities; machine learning; XGBoost; AdaBoost; random forest
Online Access: https://www.mdpi.com/2227-9059/11/11/3015
collection DOAJ
description Endometriosis is defined as the presence of estrogen-dependent endometrial-like tissue outside the uterine cavity. Despite extensive research, endometriosis remains an enigmatic disease that is challenging to diagnose and treat. A common clinical finding is the association of endometriosis with multiple other diseases. We use a total of 627,566 clinically collected records, comprising cases of endometriosis (0.82%) and controls (99.18%), to construct and evaluate predictive models. We develop a machine learning platform to construct diagnostic tools for endometriosis. The platform uses logistic regression, decision tree, random forest, AdaBoost, and XGBoost for prediction, and Shapley Additive Explanation (SHAP) values to quantify the importance of features. In the model selection phase, the XGBoost model performs better than the other algorithms; in the evaluation phase, it achieves an area under the curve (AUC) of 0.725 on the test set, with a specificity of 62.9% and a sensitivity of 68.6%. The model yields a low positive predictive value of 1.5% but a high negative predictive value of 99.58%. Moreover, the feature importance analysis points to age, infertility, uterine fibroids, anxiety, and allergic rhinitis as the five most important features for predicting endometriosis. Although these results show the feasibility of using machine learning to improve the diagnosis of endometriosis, more research is required to improve the performance of the predictive models. The limited performance is attributed in part to the complex nature of the condition and in part to the administrative nature of our features. With more informative features, a higher AUC for predicting endometriosis might be achievable. We therefore regard the constructed predictive model merely as a tool to provide auxiliary information in clinical practice.
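The gap between the very low positive predictive value and the high negative predictive value follows largely from the low prevalence of cases. The sketch below is not taken from the article; it is a minimal illustration that applies Bayes' rule to the reported sensitivity and specificity, under the assumption that the test-set prevalence equals the overall case share of 0.82%. The function name predictive_values is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    All three arguments are fractions in [0, 1].
    """
    # Expected share of the population in each confusion-matrix cell.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative)
    return ppv, npv


# Values reported in the abstract; the prevalence of 0.82% is ASSUMED
# to carry over unchanged to the test set.
ppv, npv = predictive_values(sensitivity=0.686, specificity=0.629, prevalence=0.0082)
print(f"PPV = {ppv:.1%}, NPV = {npv:.2%}")
```

Under that assumption the computed values come out at roughly 1.5% and 99.6%, close to the reported 1.5% and 99.58%. A small difference in the second decimal would be expected if the test-set prevalence or the rounding of the reported figures differs slightly.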
first_indexed 2024-03-09T17:00:04Z
format Article
id doaj.art-c5430d27e5a04c6f938e8795ae5b2899
institution Directory Open Access Journal
issn 2227-9059
language English
last_indexed 2024-03-09T17:00:04Z
spelling doaj.art-c5430d27e5a04c6f938e8795ae5b2899 2023-11-24T14:31:10Z eng
volume 11
issue 11
article_number 3015
doi 10.3390/biomedicines11113015
author_affiliation Ulan Tore: School of Engineering and Digital Sciences, Nazarbayev University, Astana 010000, Kazakhstan
Aibek Abilgazym: School of Engineering and Digital Sciences, Nazarbayev University, Astana 010000, Kazakhstan
Angel Asunsolo-del-Barco: Department of Surgery, Medical and Social Sciences, Faculty of Medicine, University of Alcalá, 288871 Alcalá de Henares, Spain
Milan Terzic: Department of Surgery, School of Medicine, Nazarbayev University, Astana 010000, Kazakhstan
Yerden Yemenkhan: Department of Medicine, School of Medicine, Nazarbayev University, Astana 010000, Kazakhstan
Amin Zollanvari: School of Engineering and Digital Sciences, Nazarbayev University, Astana 010000, Kazakhstan
Antonio Sarria-Santamera: Department of Biomedical Sciences, School of Medicine, Nazarbayev University, Astana 010000, Kazakhstan
topic endometriosis
comorbidities
machine learning
XGBoost
AdaBoost
random forest
url https://www.mdpi.com/2227-9059/11/11/3015