Early prediction of heart disease with data analysis using supervised learning with stochastic gradient boosting

Bibliographic Details
Main Authors: Anil Pandurang Jawalkar, Pandla Swetcha, Nuka Manasvi, Pakki Sreekala, Samudrala Aishwarya, Potru Kanaka Durga Bhavani, Pendem Anjani
Format: Article
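Language: English
Published: SpringerOpen 2023-10-01
Series: Journal of Engineering and Applied Science
Subjects: Heart disease; Machine learning; Decision tree; Random forest; Stochastic gradient boosting; Loss optimization
Online Access: https://doi.org/10.1186/s44147-023-00280-y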
author Anil Pandurang Jawalkar
Pandla Swetcha
Nuka Manasvi
Pakki Sreekala
Samudrala Aishwarya
Potru Kanaka Durga Bhavani
Pendem Anjani
author_sort Anil Pandurang Jawalkar
collection DOAJ
description Abstract Heart diseases are consistently ranked among the leading causes of mortality worldwide. Early detection and accurate prediction of heart disease can help manage and prevent it effectively. However, traditional methods have failed to improve heart disease classification performance. This article therefore proposes a machine learning approach for heart disease prediction (HDP) using a decision tree-based random forest (DTRF) classifier with loss optimization. First, a dataset of patient records labeled for the presence or absence of heart disease is preprocessed. Next, a DTRF classifier is trained on the dataset using the stochastic gradient boosting (SGB) loss optimization technique, and its performance is evaluated on a separate test dataset. The results show that the proposed HDP-DTRF approach achieved 86% precision, 86% recall, an 85% F1-score, and 96% accuracy on publicly available real-world datasets, which is higher than the traditional methods.
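The four metrics named in the abstract all derive from the same confusion-matrix counts on the held-out test set. The following is a minimal sketch of that relationship, not the authors' code: the function name `binary_metrics`, the label encoding (1 = heart disease present, 0 = absent), and the sample labels are assumptions made purely for illustration and are not taken from the article.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, F1-score and accuracy for binary labels.

    Assumed encoding (hypothetical): 1 = disease present, 0 = absent.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Made-up test labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because accuracy also counts true negatives while precision and recall do not, accuracy can be noticeably higher than the other three metrics when most test records belong to the negative class.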
first_indexed 2024-03-10T17:45:09Z
format Article
id doaj.art-a2d4f3e162944bdfa8f18a483e1f3ae4
institution Directory of Open Access Journals
issn 1110-1903
2536-9512
language English
last_indexed 2024-03-10T17:45:09Z
publishDate 2023-10-01
publisher SpringerOpen
record_format Article
series Journal of Engineering and Applied Science
spelling doaj.art-a2d4f3e162944bdfa8f18a483e1f3ae4
2023-11-20T09:33:29Z
eng
SpringerOpen
Journal of Engineering and Applied Science
1110-1903
2536-9512
2023-10-01
70
1
1
18
10.1186/s44147-023-00280-y
Early prediction of heart disease with data analysis using supervised learning with stochastic gradient boosting
Anil Pandurang Jawalkar
Pandla Swetcha
Nuka Manasvi
Pakki Sreekala
Samudrala Aishwarya
Potru Kanaka Durga Bhavani
Pendem Anjani
Department of Information Technology, Malla Reddy Engineering College for Women (UGC-Autonomous) (affiliation of all seven authors)
Abstract Heart diseases are consistently ranked among the leading causes of mortality worldwide. Early detection and accurate prediction of heart disease can help manage and prevent it effectively. However, traditional methods have failed to improve heart disease classification performance. This article therefore proposes a machine learning approach for heart disease prediction (HDP) using a decision tree-based random forest (DTRF) classifier with loss optimization. First, a dataset of patient records labeled for the presence or absence of heart disease is preprocessed. Next, a DTRF classifier is trained on the dataset using the stochastic gradient boosting (SGB) loss optimization technique, and its performance is evaluated on a separate test dataset. The results show that the proposed HDP-DTRF approach achieved 86% precision, 86% recall, an 85% F1-score, and 96% accuracy on publicly available real-world datasets, which is higher than the traditional methods.
https://doi.org/10.1186/s44147-023-00280-y
Heart disease
Machine learning
Decision tree
Random forest
Stochastic gradient boosting
Loss optimization
title Early prediction of heart disease with data analysis using supervised learning with stochastic gradient boosting
topic Heart disease
Machine learning
Decision tree
Random forest
Stochastic gradient boosting
Loss optimization
url https://doi.org/10.1186/s44147-023-00280-y