Classification models combined with Boruta feature selection for heart disease prediction

Cardiovascular disease (CVD), generally called heart illness, is a collective term for various ailments that affect the heart and blood vessels. Heart disease is a primary cause of fatality and morbidity in people worldwide, resulting in 18 million deaths per year. By identifying those who are most...

Bibliographic Details
Main Authors: G. Manikandan, B. Pragadeesh, V. Manojkumar, A.L. Karthikeyan, R. Manikandan, Amir H. Gandomi
Format: Article
Language: English
Published: Elsevier 2024-01-01
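Series: Informatics in Medicine Unlocked
Subjects: Decision tree; Logistic regression; Support Vector Machine; Boruta; Heart disease; Feature selection
Online Access: http://www.sciencedirect.com/science/article/pii/S2352914823002885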
author G. Manikandan
B. Pragadeesh
V. Manojkumar
A.L. Karthikeyan
R. Manikandan
Amir H. Gandomi
collection DOAJ
description Cardiovascular disease (CVD), generally called heart illness, is a collective term for various ailments that affect the heart and blood vessels. Heart disease is a primary cause of fatality and morbidity in people worldwide, resulting in 18 million deaths per year. By identifying those who are most vulnerable to heart diseases and ensuring they receive the appropriate care, premature demise can be prevented. Machine learning algorithms are now crucial in the medical field, especially when using medical databases to diagnose diseases. Such efficient algorithms and data processing techniques are applied to predict various diseases and offer much potential for accurate heart disease prognosis. Therefore, this study compares the performance of logistic regression, decision tree, and support vector machine (SVM) methods with and without Boruta feature selection. The Cleveland Clinic Heart Disease Dataset acquired from Kaggle, which consists of 14 features and 303 instances, was used for the investigation. It was found that the Boruta feature selection algorithm, which selects six of the most relevant features, improved the results of the algorithms. Among these classification algorithms, logistic regression produced the most efficient result, with an accuracy of 88.52%.
first_indexed 2024-03-08T12:08:34Z
format Article
id doaj.art-4b5970b3ad4f434e9b9e1d6550ef4c2a
institution Directory Open Access Journal
issn 2352-9148
language English
last_indexed 2024-03-08T12:08:34Z
publishDate 2024-01-01
publisher Elsevier
record_format Article
series Informatics in Medicine Unlocked
spelling doaj.art-4b5970b3ad4f434e9b9e1d6550ef4c2a
2024-01-23T04:15:52Z
eng
Elsevier
Informatics in Medicine Unlocked
2352-9148
2024-01-01
44
101442
Classification models combined with Boruta feature selection for heart disease prediction
G. Manikandan (0): School of Computing, SASTRA Deemed University, Thanjavur, Tamil Nadu, India
B. Pragadeesh (1): School of Computing, SASTRA Deemed University, Thanjavur, Tamil Nadu, India
V. Manojkumar (2): School of Computing, SASTRA Deemed University, Thanjavur, Tamil Nadu, India
A.L. Karthikeyan (3): School of Computing, SASTRA Deemed University, Thanjavur, Tamil Nadu, India
R. Manikandan (4): School of Computing, SASTRA Deemed University, Thanjavur, Tamil Nadu, India
Amir H. Gandomi (5): Faculty of Engineering & Information Technology, University of Technology Sydney, Sydney, NSW, Australia; University Research and Innovation Center (EKIK), Óbuda University, 1034, Budapest, Hungary; Corresponding author. Faculty of Engineering & Information Systems, University of Technology Sydney, Sydney, Australia.
title Classification models combined with Boruta feature selection for heart disease prediction
topic Decision tree
Logistic regression
Support Vector Machine
Boruta
Heart disease
Feature selection
url http://www.sciencedirect.com/science/article/pii/S2352914823002885