A Detection of Type2 Diabetes using C4.5 Decision Tree
Main Author: | Hamed Sabbagh Gol |
---|---|
Format: | Article |
Language: | fas |
Published: | Kerman University of Medical Sciences, 2018-09-01 |
Series: | مجله انفورماتیک سلامت و زیست پزشکی (Journal of Health and Biomedical Informatics) |
Subjects: | data mining; type2 diabetes; c4.5 decision tree |
Online Access: | http://jhbmi.ir/article-1-281-en.html |
_version_ | 1811176266531864576 |
---|---|
author | Hamed Sabbagh Gol |
collection | DOAJ |
description | Introduction: Diabetes is one of the most common diseases in the world, and its global prevalence increases by about six percent annually. Data mining techniques for building predictive models help identify people at risk and reduce the complications of the disease. This study used the C4.5 decision tree to investigate methods of prevention and treatment of diabetes.
Methods: In this applied, descriptive study, we used the standard pima-Indians-diabetes data set from the UCI repository. The data set contains 768 records with 8 fields. The analysis was carried out in Weka following the CRISP-DM methodology. The C4.5 decision tree was built by specifying the input variables and the target variable. Sensitivity, specificity, accuracy, and positive and negative predictive values were used to evaluate the model.
Results: According to the model, high blood sugar, high gravidity, older age, high diastolic blood pressure, family history, and high BMI had, in that order, the greatest effects on type 2 diabetes mellitus. The ranking rate was 73.8% and the accuracy of the C4.5 algorithm was 79%.
Conclusion: Compared with the results of other data mining studies on diabetes, the accuracy of the proposed algorithm is acceptable. The factors with the greatest effect on diabetes were identified, and rules were developed that can serve as a model for predicting the risk of diabetes in individuals. |
first_indexed | 2024-04-10T19:49:14Z |
format | Article |
id | doaj.art-25022d736c984430a4f47d9820d3d52d |
institution | Directory Open Access Journal |
issn | 2423-3870, 2423-3498 |
language | fas |
last_indexed | 2024-04-10T19:49:14Z |
publishDate | 2018-09-01 |
publisher | Kerman University of Medical Sciences |
record_format | Article |
series | مجله انفورماتیک سلامت و زیست پزشکی (Journal of Health and Biomedical Informatics) |
spelling | doaj.art-25022d736c984430a4f47d9820d3d52d; 2023-01-28T10:41:54Z; fas; Kerman University of Medical Sciences; مجله انفورماتیک سلامت و زیست پزشکی; 2423-3870; 2423-3498; 2018-09-01; 52293303; A Detection of Type2 Diabetes using C4.5 Decision Tree; Hamed Sabbagh Gol (M.Sc in Computer Engineering, Faculty of Computer, Department of Computer Engineering, Payame Noor University (PNU), Iran); http://jhbmi.ir/article-1-281-en.html; data mining; type2 diabetes; c4.5 decision tree |
title | A Detection of Type2 Diabetes using C4.5 Decision Tree |
title_sort | detection of type2 diabetes using c4 5 decision tree |
topic | data mining; type2 diabetes; c4.5 decision tree |
url | http://jhbmi.ir/article-1-281-en.html |
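
The abstract names five evaluation measures for the decision tree: sensitivity, specificity, accuracy, and positive and negative predictive values. The sketch below shows how these measures are conventionally derived from a binary confusion matrix. It is an illustration only, not the study's Weka workflow: the function names (`confusion_matrix`, `evaluate`), the label coding (1 = diabetic, 0 = non-diabetic), and the ten toy labels are all assumptions made here and are not taken from the article or the Pima data set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # diabetic, predicted diabetic
    fp: int  # non-diabetic, predicted diabetic
    tn: int  # non-diabetic, predicted non-diabetic
    fn: int  # diabetic, predicted non-diabetic


def confusion_matrix(actual, predicted, positive=1):
    """Count the four outcomes for a binary classifier."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionMatrix(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator, denominator):
    """Return numerator/denominator, or None when the measure is undefined."""
    return numerator / denominator if denominator else None


def evaluate(cm):
    """Compute the five measures listed in the abstract."""
    total = cm.tp + cm.fp + cm.tn + cm.fn
    return {
        "sensitivity": _ratio(cm.tp, cm.tp + cm.fn),  # true positive rate
        "specificity": _ratio(cm.tn, cm.tn + cm.fp),  # true negative rate
        "accuracy": _ratio(cm.tp + cm.tn, total),
        "ppv": _ratio(cm.tp, cm.tp + cm.fp),  # positive predictive value
        "npv": _ratio(cm.tn, cm.tn + cm.fn),  # negative predictive value
    }


# Hypothetical toy labels (NOT results from the study).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

cm = confusion_matrix(actual, predicted)
metrics = evaluate(cm)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all five measures, rather than accuracy alone, matters for a data set such as this one where the two classes are not equally frequent: a model can reach a high accuracy while still missing many diabetic cases, which shows up as low sensitivity.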