Classification models combined with Boruta feature selection for heart disease prediction
Cardiovascular disease (CVD), generally called heart illness, is a collective term for various ailments that affect the heart and blood vessels. Heart disease is a primary cause of fatality and morbidity in people worldwide, resulting in 18 million deaths per year. By identifying those who are most...
Main Authors: | G. Manikandan, B. Pragadeesh, V. Manojkumar, A.L. Karthikeyan, R. Manikandan, Amir H. Gandomi |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2024-01-01 |
Series: | Informatics in Medicine Unlocked |
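Subjects: | Decision tree, Logistic regression, Support Vector Machine, Boruta, Heart disease, Feature selection |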
Online Access: | http://www.sciencedirect.com/science/article/pii/S2352914823002885 |
_version_ | 1797348632754651136 |
---|---|
author | G. Manikandan, B. Pragadeesh, V. Manojkumar, A.L. Karthikeyan, R. Manikandan, Amir H. Gandomi |
author_sort | G. Manikandan |
collection | DOAJ |
description | Cardiovascular disease (CVD), generally called heart illness, is a collective term for various ailments that affect the heart and blood vessels. Heart disease is a primary cause of fatality and morbidity in people worldwide, resulting in 18 million deaths per year. By identifying those who are most vulnerable to heart diseases and ensuring they receive the appropriate care, premature demise can be prevented. Machine learning algorithms are now crucial in the medical field, especially when using medical databases to diagnose diseases. Such efficient algorithms and data processing techniques are applied to predict various diseases and offer much potential for accurate heart disease prognosis. Therefore, this study compares the performance of logistic regression, decision tree, and support vector machine (SVM) methods with and without Boruta feature selection. The Cleveland Clinic Heart Disease Dataset acquired from Kaggle, which consists of 14 features and 303 instances, was used for the investigation. It was found that the Boruta feature selection algorithm, which selects six of the most relevant features, improved the results of the algorithms. Among these classification algorithms, logistic regression produced the most efficient result, with an accuracy of 88.52%. |
first_indexed | 2024-03-08T12:08:34Z |
format | Article |
id | doaj.art-4b5970b3ad4f434e9b9e1d6550ef4c2a |
institution | Directory of Open Access Journals |
issn | 2352-9148 |
language | English |
last_indexed | 2024-03-08T12:08:34Z |
publishDate | 2024-01-01 |
publisher | Elsevier |
record_format | Article |
series | Informatics in Medicine Unlocked |
spelling | doaj.art-4b5970b3ad4f434e9b9e1d6550ef4c2a · 2024-01-23T04:15:52Z · eng · Elsevier · Informatics in Medicine Unlocked · 2352-9148 · 2024-01-01 · 44 · 101442 · Classification models combined with Boruta feature selection for heart disease prediction · G. Manikandan (0), B. Pragadeesh (1), V. Manojkumar (2), A.L. Karthikeyan (3), R. Manikandan (4), Amir H. Gandomi (5) · (0-4) School of Computing, SASTRA Deemed University, Thanjavur, Tamil Nadu, India · (5) Faculty of Engineering & Information Technology, University of Technology Sydney, Sydney, NSW, Australia; University Research and Innovation Center (EKIK), Óbuda University, 1034, Budapest, Hungary; Corresponding author. Faculty of Engineering & Information Systems, University of Technology Sydney, Sydney, Australia. · http://www.sciencedirect.com/science/article/pii/S2352914823002885 · Decision tree, Logistic regression, Support Vector Machine, Boruta, Heart disease, Feature selection |
title | Classification models combined with Boruta feature selection for heart disease prediction |
topic | Decision tree Logistic regression Support Vector Machine Boruta Heart disease Feature selection |
url | http://www.sciencedirect.com/science/article/pii/S2352914823002885 |
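
The description in this record mentions Boruta feature selection without saying how it works. As a rough illustration of the shadow-feature idea behind Boruta, the Python sketch below keeps a feature only if it scores higher than the best randomly shuffled ("shadow") copy in most rounds. It is not the authors' code. Boruta proper scores features with random-forest importance and applies a statistical test over many iterations; this sketch swaps in absolute Pearson correlation and a plain hit-rate threshold so that it runs on the standard library alone. The function names, the `n_rounds` and `min_hit_rate` parameters, and the synthetic data are all assumptions made for illustration, and none of the data comes from the Cleveland dataset.

```python
import random
import statistics


def pearson_abs(xs, ys):
    """Absolute Pearson correlation; 0.0 if either column is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(sxy) / (sxx * syy) ** 0.5


def shadow_feature_selection(columns, target, n_rounds=50,
                             min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shadow feature in most rounds.

    columns: dict mapping feature name -> list of numeric values.
    target:  list of 0/1 labels, same length as each column.
    Simplification (assumption): importance is absolute correlation,
    not the random-forest importance used by Boruta itself.
    """
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    real_scores = {name: pearson_abs(col, target)
                   for name, col in columns.items()}
    for _ in range(n_rounds):
        # Shadow features: each real column with its values permuted,
        # which destroys any relationship with the target.
        best_shadow = 0.0
        for col in columns.values():
            shadow = col[:]
            rng.shuffle(shadow)
            best_shadow = max(best_shadow, pearson_abs(shadow, target))
        # A real feature scores a "hit" when it beats every shadow.
        for name in columns:
            if real_scores[name] > best_shadow:
                hits[name] += 1
    return [name for name in columns
            if hits[name] / n_rounds >= min_hit_rate]


def make_synthetic(n=303, seed=1):
    """Hypothetical data: one informative column and two noise columns."""
    rng = random.Random(seed)
    target = [rng.randint(0, 1) for _ in range(n)]
    columns = {
        "informative": [2.0 * y + rng.gauss(0, 1) for y in target],
        "noise_a": [rng.gauss(0, 1) for _ in range(n)],
        "noise_b": [rng.gauss(0, 1) for _ in range(n)],
    }
    return columns, target


columns, target = make_synthetic()
selected = shadow_feature_selection(columns, target)
print(selected)
```

In a comparison like the one the description outlines, the selected column names would then be used to subset the data before fitting each classifier, so that accuracy with and without selection can be compared on the same train/test split.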