Predicting Breast Cancer from Risk Factors Using SVM and Extra-Trees-Based Feature Selection Method


Bibliographic Details
Main Authors: Alfian, Ganjar; Syafrudin, Muhammad; Fahrurrozi, Imam; Fitriyani, Norma Latif; Atmaji, Fransiskus Tatas Dwi; Widodo, Tri; Bahiyah, Nurul; Benes, Filip; Rhee, Jongtae
Format: Article
Language: English
Published: MDPI 2022
Subjects: Cancer Diagnosis
Online Access: https://repository.ugm.ac.id/282964/1/computers-11-00136-v2.pdf
collection UGM
description Developing a prediction model from risk factors can provide an efficient way to recognize breast cancer. Machine learning (ML) algorithms have been applied to increase the efficiency of early-stage diagnosis. This paper studies a support vector machine (SVM) combined with an extremely randomized trees classifier (extra-trees) to diagnose breast cancer at an early stage based on risk factors. The extra-trees classifier was used to remove irrelevant features, while the SVM was used to predict breast cancer status. The ML models were trained on a breast cancer dataset of 116 subjects and evaluated with stratified 10-fold cross-validation. The proposed combined SVM and extra-trees model reached the highest accuracy, up to 80.23%, which was significantly better than the other ML models. The experimental results showed that extra-trees-based feature selection improved the average ML prediction accuracy by up to 7.29% compared with ML without feature selection. The proposed model is expected to increase the efficiency of breast cancer diagnosis based on risk factors. In addition, the authors present how the prediction model could be deployed for web-based breast cancer prediction. The model is expected to improve diagnostic decision-support systems by predicting breast cancer accurately.
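The description mentions stratified 10-fold cross-validation on 116 subjects but does not show how the folds are built. The following is a minimal sketch of the idea in plain Python, not the authors' implementation: the function name `stratified_k_fold`, the seed, and the 52/64 class split are assumptions made for illustration only. In practice a library routine such as scikit-learn's `StratifiedKFold` would normally be used instead.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep the class ratio."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal this class round-robin, continuing where the previous
        # class stopped so that total fold sizes stay balanced.
        for j, idx in enumerate(indices):
            folds[(offset + j) % k].append(idx)
        offset += len(indices)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical labels: 116 subjects, assumed 52 controls (0) and 64 patients (1).
labels = [0] * 52 + [1] * 64
splits = list(stratified_k_fold(labels, k=10, seed=42))
for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    positives = sum(labels[i] for i in test_idx)
    print(f"fold {fold_no}: {len(test_idx)} test subjects, {positives} positive")
```

Under these assumptions every subject is tested exactly once, and each test fold holds roughly the same share of positive cases as the full dataset. A reported cross-validated accuracy is then typically the mean of the per-fold accuracies.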
first_indexed 2024-03-14T00:05:59Z
format Article
id oai:generic.eprints.org:282964
institution Universitas Gadjah Mada
language English
last_indexed 2024-03-14T00:05:59Z
publishDate 2022
publisher MDPI
record_format dspace
spelling oai:generic.eprints.org:282964 2023-11-17T03:04:06Z https://repository.ugm.ac.id/282964/ MDPI 2022 Article PeerReviewed application/pdf en https://repository.ugm.ac.id/282964/1/computers-11-00136-v2.pdf Alfian, Ganjar and Syafrudin, Muhammad and Fahrurrozi, Imam and Fitriyani, Norma Latif and Atmaji, Fransiskus Tatas Dwi and Widodo, Tri and Bahiyah, Nurul and Benes, Filip and Rhee, Jongtae (2022) Predicting Breast Cancer from Risk Factors Using SVM and Extra-Trees-Based Feature Selection Method. Computers, 11 (9). ISSN 2073-431X
title Predicting Breast Cancer from Risk Factors Using SVM and Extra-Trees-Based Feature Selection Method
topic Cancer Diagnosis
url https://repository.ugm.ac.id/282964/1/computers-11-00136-v2.pdf