Machine learning-based ovarian cancer prediction with XGboost and stochastic gradient boosting models


Bibliographic Details
Main Authors: Onural Ozhan, Zeynep Kucukakcali, Ipek Balikci Cicek
Format: Article
Language: English
Published: Society of Turaz Bilim, 2023-03-01
Series: Medicine Science
Subjects: ovarian cancer; classification; machine learning; xgboost; stochastic gradient boosting
Online Access: https://www.medicinescience.org/?mno=114710
author Onural Ozhan
Zeynep Kucukakcali
Ipek Balikci Cicek
collection DOAJ
description Ovarian cancer is one of the most common gynecological malignancies and is characterized by a high mortality rate, silent and occult tumor growth, late onset of symptoms, and diagnosis at advanced stages. The need for new diagnostic techniques that predict the course and prognosis of this malignancy has therefore increased. This study classified ovarian cancer and benign ovarian tumor samples with the machine learning methods XGBoost and Stochastic Gradient Boosting in order to build an accurate diagnostic prediction model and to identify disease-related risk factors. An open-access data set of ovarian cancer and benign ovarian tumor samples from 349 patients was used. The data set was split 80:20 into training and test sets. XGBoost and Stochastic Gradient Boosting models were built for classification using five-fold cross-validation. Model performance was evaluated with accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score. XGBoost gave the best classification result; on the test set, its accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score were 89.5%, 88.7%, 85.7%, 91.7%, 85.7%, 91.7%, and 85.7%, respectively. According to the variable importance obtained from the model, the variables most associated with the diagnosis were, in order, CA72-4, HE4, LYM%, ALB, EO%, BUN, RBC, NEU, and MCV. The model classified ovarian cancer successfully and yielded a highly accurate diagnostic prediction model. The study identified parameters that can support the diagnosis of ovarian cancer with high accuracy; with these parameters, clinicians can simplify the decision-making process for diagnosing ovarian cancer. [Med-Science 2023; 12(1): 231-7]
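The performance metrics named in the description all derive from a binary confusion matrix. The sketch below shows the standard formulas. The function name `diagnostic_metrics` and the four counts are hypothetical: this record does not report the study's confusion matrix, and the counts were chosen only because they produce the same rounded percentages as the reported XGBoost test results.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    ppv = tp / (tp + fp)                  # positive predictive value (precision)
    npv = tn / (tn + fn)                  # negative predictive value
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of PPV and sensitivity
    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts for illustration only; NOT taken from the study.
metrics = diagnostic_metrics(tp=12, fn=2, tn=22, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity, positive predictive value, and F1 coincide whenever the numbers of false negatives and false positives are equal, which is consistent with the three identical 85.7% values in the description.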
first_indexed 2024-03-08T03:19:23Z
format Article
id doaj.art-2376186ae5f348b9a14f35b488fb6e3b
institution Directory Open Access Journal
issn 2147-0634
language English
last_indexed 2024-03-08T03:19:23Z
publishDate 2023-03-01
publisher Society of Turaz Bilim
record_format Article
series Medicine Science
spelling doaj.art-2376186ae5f348b9a14f35b488fb6e3b
2024-02-12T10:34:08Z
eng
Society of Turaz Bilim
Medicine Science
2147-0634
2023-03-01
Volume 12, Issue 1, pp. 231-7
doi:10.5455/medscience.2022.09.207
114710
Machine learning-based ovarian cancer prediction with XGboost and stochastic gradient boosting models
Onural Ozhan (Inonu University Faculty of Medicine, Department of Pharmacology)
Zeynep Kucukakcali (Inonu University Faculty of Medicine, Department of Biostatistics and Medical Informatics)
Ipek Balikci Cicek (Inonu University Faculty of Medicine, Department of Biostatistics and Medical Informatics)
https://www.medicinescience.org/?mno=114710
ovarian cancer
classification
machine learning
xgboost
stochastic gradient boosting
title Machine learning-based ovarian cancer prediction with XGboost and stochastic gradient boosting models
topic ovarian cancer
classification
machine learning
xgboost
stochastic gradient boosting
url https://www.medicinescience.org/?mno=114710