Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application

Aim: The diagnosis of breast cancer can be accomplished using an algorithm or an early detection model of breast cancer risk via determining factors. In the present study, gradient boosting machines (GBM), extreme gradient boosting (XGBoost) and light gradient boosting (LightGBM) models were applied...

Bibliographic Details
Main Authors: Sami Akbulut, Ipek Balikci Cicek, Cemil Colak
Format: Article
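Language: English
Published: Galenos Yayinevi 2022-06-01
Series: Haseki Tıp Bülteni
Subjects: breast cancer, boosting algorithm, gradient boosting algorithm, xgboost algorithm, lightgbm algorithm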
Online Access: http://www.hasekidergisi.com/archives/archive-detail/article-preview/classification-of-breast-cancer-on-the-strength-of/52214
author Sami Akbulut
Ipek Balikci Cicek
Cemil Colak
collection DOAJ
description Aim: The diagnosis of breast cancer can be accomplished using an algorithm or an early detection model of breast cancer risk based on determining factors. In the present study, gradient boosting machine (GBM), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) models were applied and their performances were compared. Methods: The open-access Breast Cancer Wisconsin Dataset, which includes 10 features of breast tumors and results from 569 patients, was used for this study. The GBM, XGBoost, and LightGBM models for classifying breast cancer were built with a repeated stratified K-fold cross-validation method. Model performance was evaluated with accuracy, recall, precision, and area under the curve (AUC). Results: The accuracy, recall, AUC, and precision values were 93.9%, 93.5%, 0.984, and 93.8% for GBM; 94.6%, 94%, 0.985, and 94.6% for XGBoost; and 95.3%, 94.8%, 0.987, and 95.5% for LightGBM. According to these results, the LightGBM model achieved the best value on every metric. When the effects of the variables in the dataset on breast cancer were assessed, the five most important variables for the LightGBM model were, in order, concave points mean, texture mean, concavity mean, radius mean, and perimeter mean. Conclusion: According to the findings of the study, the LightGBM model gave more successful predictions for breast cancer classification than the other models. Unlike similar studies examining the same dataset, this study reported variable importance for the breast cancer-related variables. Applying the LightGBM approach in the medical field can help doctors make a quick and precise diagnosis.
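The evaluation protocol named in the abstract (repeated stratified K-fold cross-validation scored with accuracy, recall, precision, and AUC) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names, the fold and repeat counts, and the seed are hypothetical, and the 212 malignant / 357 benign class split is the commonly reported breakdown of the 569-case Wisconsin dataset rather than a figure stated in the abstract.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=5, n_repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold keeps the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for idxs in by_class.values():
            shuffled = idxs[:]
            rng.shuffle(shuffled)
            # Deal each class round-robin so every fold gets its share.
            for pos, i in enumerate(shuffled):
                folds[pos % n_splits].append(i)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
            yield train, test


def classification_metrics(y_true, y_pred):
    """Accuracy, recall, and precision for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


def roc_auc(y_true, scores):
    """AUC as the chance a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Assumed class counts (212 malignant, 357 benign); not given in the abstract.
labels = [1] * 212 + [0] * 357
splits = list(repeated_stratified_kfold(labels, n_splits=5, n_repeats=3))
print(len(splits), "train/test splits generated")
```

In a full pipeline, each boosting model would be fitted on the training indices of a split and scored on the test indices, with the four metrics averaged over all splits. Stratifying before dealing out folds keeps the malignant share of every test fold close to that of the whole dataset, which matters when the classes are unbalanced.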
first_indexed 2024-04-10T13:52:43Z
format Article
id doaj.art-017dbdebab6e4e58ac4777119f9d4d2b
institution Directory of Open Access Journals
issn 1302-0072
2147-2688
language English
last_indexed 2024-04-10T13:52:43Z
publishDate 2022-06-01
publisher Galenos Yayinevi
record_format Article
series Haseki Tıp Bülteni
spelling doaj.art-017dbdebab6e4e58ac4777119f9d4d2b; 2023-02-15T16:10:38Z; eng; Galenos Yayinevi; Haseki Tıp Bülteni; 1302-0072; 2147-2688; 2022-06-01; 60; 3; 196; 203; 10.4274/haseki.galenos.2022.844013049054; Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application; Sami Akbulut (0), Ipek Balikci Cicek (1), Cemil Colak (2); (0) Inonu University Faculty of Medicine, Department of General Surgery, Malatya, Turkey; (1) Inonu University Faculty of Medicine, Department of Biostatistics and Medical Informatics, Malatya, Turkey; (2) Inonu University Faculty of Medicine, Department of Biostatistics and Medical Informatics, Malatya, Turkey; http://www.hasekidergisi.com/archives/archive-detail/article-preview/classification-of-breast-cancer-on-the-strength-of/52214; breast cancer; boosting algorithm; gradient boosting algorithm; xgboost algorithm; lightgbm algorithm
title Classification of Breast Cancer on the Strength of Potential Risk Factors with Boosting Models: A Public Health Informatics Application
topic breast cancer
boosting algorithm
gradient boosting algorithm
xgboost algorithm
lightgbm algorithm
url http://www.hasekidergisi.com/archives/archive-detail/article-preview/classification-of-breast-cancer-on-the-strength-of/52214