Analysis of Accuracy Metric of Machine Learning Algorithms in Predicting Heart Disease
Introduction: Heart disease generally refers to conditions involving narrowed or blocked blood vessels that can lead to a heart attack, chest pain, or stroke. Earlier identification of heart disease may reduce the death rate. The cost of medical diagnosis makes early treatment impractical for t...
Main Authors: | Sajad Yousefi, Maryam Poornajaf |
---|---|
Format: | Article |
Language: | English |
Published: | Hamara Afzar 2023-04-01 |
Series: | Frontiers in Health Informatics |
Subjects: | f1-score, machine learning, heart disease, classification, importance score, accuracy |
Online Access: | http://ijmi.ir/index.php/IJMI/article/view/402 |
_version_ | 1797371001767460864 |
---|---|
author | Sajad Yousefi; Maryam Poornajaf |
collection | DOAJ |
description | Introduction: Heart disease generally refers to conditions involving narrowed or blocked blood vessels that can lead to a heart attack, chest pain, or stroke. Earlier identification of heart disease may reduce the death rate. The cost of medical diagnosis makes early treatment impractical for the large number of people affected. Machine learning models applied to a dataset can support this task. This article aims to find the most efficient and accurate machine learning models for disease prediction.
Material and Methods: Several supervised machine learning algorithms were used for the diagnosis and prediction of heart disease, including logistic regression, decision tree, random forest, and k-nearest neighbors (KNN). The algorithms were applied to a dataset of 70,000 samples taken from Kaggle. Methods such as feature importance, hold-out validation, 10-fold cross-validation, stratified 10-fold cross-validation, and leave-one-out cross-validation were used to assess performance and improve accuracy. In addition, feature importance scores were estimated for each feature in some algorithms, and the features were ranked by these scores. All work was done in the Anaconda environment using the Python programming language and the scikit-learn library.
Results: The algorithms were compared with one another using the ROC curve and criteria such as accuracy, precision, sensitivity, and F1 score, evaluated for each model. The random forest algorithm, with an F1 score of 92%, an accuracy of 92%, and a ROC AUC of 95%, performed better than the other algorithms.
Conclusion: The area under the ROC curve and the evaluation criteria of several machine learning classification algorithms for the diagnosis and prediction of heart disease were compared to determine the most appropriate classifier. |
first_indexed | 2024-03-08T18:13:34Z |
format | Article |
id | doaj.art-4505cd522ca34a59bbe9be5031d04e72 |
institution | Directory Open Access Journal |
issn | 2676-7104 |
language | English |
last_indexed | 2024-03-08T18:13:34Z |
publishDate | 2023-04-01 |
publisher | Hamara Afzar |
record_format | Article |
series | Frontiers in Health Informatics |
spelling | Sajad Yousefi (Faculty Member, Department of Electrical Engineering, Technical and Vocational University (TVU), Tehran); Maryam Poornajaf (Faculty Member, Department of Computer Engineering, Technical and Vocational University (TVU), Tehran). DOI: 10.30699/fhi.v12i0.402. Record timestamp: 2023-12-31T18:36:20Z. |
title | Analysis of Accuracy Metric of Machine Learning Algorithms in Predicting Heart Disease |
topic | f1-score; machine learning; heart disease; classification; importance score; accuracy |
url | http://ijmi.ir/index.php/IJMI/article/view/402 |
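
The evaluation described in the abstract (four classifiers compared under stratified 10-fold cross-validation on accuracy, precision, sensitivity, F1 score, and ROC AUC, followed by a feature importance ranking) might be sketched roughly as follows with scikit-learn. This is an illustrative sketch, not the authors' code: the data are a synthetic stand-in generated with `make_classification`, and the sample size, feature count, hyperparameters, and the scaling step are all assumptions that the record does not state.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the Kaggle data (assumption: size and feature
# count are arbitrary and chosen only to keep the example fast).
X, y = make_classification(
    n_samples=1000, n_features=11, n_informative=6, random_state=0
)

# The four algorithms named in the abstract; hyperparameters and the
# scaling step for the distance- and gradient-based models are assumptions.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Stratified 10-fold cross-validation keeps the class ratio in every fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# "recall" is scikit-learn's name for sensitivity.
scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]

results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    # Average each metric over the 10 folds.
    results[name] = {m: scores[f"test_{m}"].mean() for m in scoring}

for name, metrics in results.items():
    print(name, {m: round(v, 3) for m, v in metrics.items()})

# Rank features by the random forest's impurity-based importance scores.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
ranking = sorted(
    enumerate(forest.feature_importances_), key=lambda p: p[1], reverse=True
)
for idx, score in ranking:
    print(f"feature_{idx}: {score:.3f}")
```

Swapping `StratifiedKFold` for `KFold` or `LeaveOneOut` from `sklearn.model_selection` would give the other validation schemes the abstract lists, although leave-one-out on 70,000 samples would be slow, and ROC AUC cannot be scored on single-sample test folds.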