A Detection of Type2 Diabetes using C4.5 Decision Tree

Introduction: One of the most common diseases in the world is diabetes and the global prevalence of diabetes increases by about six percent annually. The use of data mining techniques to create predictive models is very helpful in identifying people at risk and reducing the complications of the dise...


Bibliographic Details
Main Author: Hamed Sabbagh Gol
Format: Article
Language: Persian (fas)
Published: Kerman University of Medical Sciences 2018-09-01
Series: مجله انفورماتیک سلامت و زیست پزشکی (Journal of Health and Biomedical Informatics)
Subjects: data mining; type2 diabetes; c4.5 decision tree
Online Access: http://jhbmi.ir/article-1-281-en.html
author Hamed Sabbagh Gol
collection DOAJ
description Introduction: Diabetes is one of the most common diseases in the world, and its global prevalence increases by about six percent annually. Using data mining techniques to build predictive models helps identify people at risk and reduce the complications of the disease. This study used the C4.5 decision tree to investigate methods of prevention and treatment of diabetes. Methods: This applied, descriptive study used the standard UCI Pima Indians Diabetes data set, which contains 768 records with 8 fields. The analysis was carried out in Weka following the CRISP3 methodology. The C4.5 decision tree was built from the input variables and the designated target variable. The model was evaluated using sensitivity, specificity, accuracy, and positive and negative predictive values. Results: According to the model, high blood sugar, high gravidity, older age, high diastolic blood pressure, family history, and high BMI had, in that order, the greatest effects on type 2 diabetes mellitus. The ranking rate was 73.8% and the accuracy of the C4.5 algorithm was 79%. Conclusion: Compared with the results of other data mining studies on diabetes, the accuracy of the proposed algorithm is acceptable. The most influential factors for diabetes were identified, and rules were derived that can be used as a model to predict an individual's risk of diabetes.
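The evaluation measures named in the description (sensitivity, specificity, accuracy, and positive and negative predictive values) all follow from the four cells of a binary confusion matrix. The sketch below shows one way to compute them in Python. The names `ConfusionCounts` and `evaluate` and the example counts are hypothetical and chosen only for illustration; they are not taken from the study, which reports using Weka.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (positive class = diabetic)."""
    tp: int  # diabetic, predicted diabetic
    fp: int  # non-diabetic, predicted diabetic
    tn: int  # non-diabetic, predicted non-diabetic
    fn: int  # diabetic, predicted non-diabetic


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def evaluate(c: ConfusionCounts) -> dict[str, float]:
    """Compute the five evaluation measures from the confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
        "accuracy": _ratio(c.tp + c.tn, total),
        "ppv": _ratio(c.tp, c.tp + c.fp),          # positive predictive value
        "npv": _ratio(c.tn, c.tn + c.fn),          # negative predictive value
    }


# Hypothetical counts for illustration only; NOT results from the study.
example = ConfusionCounts(tp=40, fp=10, tn=80, fn=20)
metrics = evaluate(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all five measures rather than accuracy alone matters for a data set with unequal class sizes, where a model can reach a high accuracy while still missing many positive cases.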
first_indexed 2024-04-10T19:49:14Z
format Article
id doaj.art-25022d736c984430a4f47d9820d3d52d
institution Directory Open Access Journal
issn 2423-3870
2423-3498
language fas
last_indexed 2024-04-10T19:49:14Z
publishDate 2018-09-01
publisher Kerman University of Medical Sciences
record_format Article
series مجله انفورماتیک سلامت و زیست پزشکی
spelling Hamed Sabbagh Gol (M.Sc in Computer Engineering, Faculty of Computer, Department of Computer Engineering, Payame Noor University (PNU), Iran). A Detection of Type2 Diabetes using C4.5 Decision Tree. مجله انفورماتیک سلامت و زیست پزشکی, vol. 5, no. 2, pp. 293-303, 2018-09-01. Kerman University of Medical Sciences. ISSN 2423-3870, 2423-3498. Language: fas. http://jhbmi.ir/article-1-281-en.html. Keywords: data mining; type2 diabetes; c4.5 decision tree.
title A Detection of Type2 Diabetes using C4.5 Decision Tree
topic data mining
type2 diabetes
c4.5 decision tree
url http://jhbmi.ir/article-1-281-en.html