Analysis of Accuracy Metric of Machine Learning Algorithms in Predicting Heart Disease

Bibliographic Details
Main Authors: Sajad Yousefi, Maryam Poornajaf
Format: Article
Language: English
Published: Hamara Afzar 2023-04-01
Series: Frontiers in Health Informatics
Subjects: f1-score; machine learning; heart disease; classification; importance score; accuracy
Online Access: http://ijmi.ir/index.php/IJMI/article/view/402
_version_ 1797371001767460864
author Sajad Yousefi
Maryam Poornajaf
collection DOAJ
description Introduction: Heart disease generally refers to conditions involving narrowed or blocked blood vessels that can lead to a heart attack, chest pain, or stroke. Earlier identification of heart disease may reduce the death rate. The cost of medical diagnosis makes early diagnosis impractical for large numbers of people, which motivates the use of machine learning models applied to a dataset. This article aims to find the most efficient and accurate machine learning models for disease prediction. Material and Methods: Several supervised machine learning algorithms, namely logistic regression, decision tree, random forest, and KNN, were used for the diagnosis and prediction of heart disease. The algorithms were applied to a dataset of 70,000 samples taken from Kaggle. Hold-out validation, 10-fold cross-validation, stratified 10-fold cross-validation, and leave-one-out cross-validation were used with the algorithms to obtain effective performance and higher accuracy. In addition, feature importance scores were estimated for each feature in some algorithms, and the features were ranked by these scores. All work was done in the Anaconda environment using the Python programming language and the scikit-learn library. Results: The algorithms were compared with one another based on the ROC curve and on criteria such as accuracy, precision, sensitivity, and F1 score, which were evaluated for each model. In this evaluation, the random forest algorithm, with an F1 score of 92%, accuracy of 92%, and AUC-ROC of 95%, performed better than the other algorithms. Conclusion: The area under the ROC curve and the evaluation criteria of several machine learning classification algorithms for the diagnosis and prediction of heart disease were compared to determine the most appropriate classifier.
first_indexed 2024-03-08T18:13:34Z
format Article
id doaj.art-4505cd522ca34a59bbe9be5031d04e72
institution Directory Open Access Journal
issn 2676-7104
language English
last_indexed 2024-03-08T18:13:34Z
publishDate 2023-04-01
publisher Hamara Afzar
record_format Article
series Frontiers in Health Informatics
spelling doaj.art-4505cd522ca34a59bbe9be5031d04e72 2023-12-31T18:36:20Z eng Hamara Afzar Frontiers in Health Informatics 2676-7104 2023-04-01 12010.30699/fhi.v12i0.402223 Analysis of Accuracy Metric of Machine Learning Algorithms in Predicting Heart Disease Sajad Yousefi (Faculty Member, Department of Electrical Engineering, Technical and Vocational University (TVU), Tehran) Maryam Poornajaf (Faculty Member, Department of Computer Engineering, Technical and Vocational University (TVU), Tehran)
title Analysis of Accuracy Metric of Machine Learning Algorithms in Predicting Heart Disease
topic f1-score
machine learning
heart disease
classification
importance score
accuracy
url http://ijmi.ir/index.php/IJMI/article/view/402
work_keys_str_mv AT sajadyousefi analysisofaccuracymetricofmachinelearningalgorithmsinpredictingheartdisease
AT maryampoornajaf analysisofaccuracymetricofmachinelearningalgorithmsinpredictingheartdisease
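
The evaluation criteria named in the description (accuracy, precision, sensitivity, F1 score) and the stratified k-fold splitting can be sketched as follows. This is a minimal, standard-library-only illustration and not the authors' code, which the record says was built on scikit-learn. The function names confusion_counts, classification_metrics, and stratified_k_fold are hypothetical, and the round-robin fold assignment without shuffling is a simplifying assumption.

```python
from collections import defaultdict


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, sensitivity (recall) and F1 score."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


def stratified_k_fold(labels, k=10):
    """Yield (train_idx, test_idx) pairs that preserve class proportions.

    Assumption: samples of each class are dealt round-robin into k folds
    in their original order (no shuffling), which keeps the sketch simple.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx


if __name__ == "__main__":
    # Toy labels and predictions, invented purely for illustration.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    print(classification_metrics(y_true, y_pred))

    for train_idx, test_idx in stratified_k_fold(y_true, k=2):
        print(len(train_idx), len(test_idx))
```

In practice the same quantities are available from scikit-learn (for example StratifiedKFold for the splitting); the hand-written versions here only make the definitions explicit.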