A comprehensive exploration to the machine learning techniques for diabetes identification

Diabetes mellitus, commonly known as diabetes, is a group of metabolic disorders that has affected hundreds of millions of people. Detecting diabetes is of great importance because of its severe complications. There have been many research studies on diabetes identification, many of which are...

Bibliographic Details
Main Authors: Wei, Sidong, Zhao, Xuejiao, Miao, Chunyan
Other Authors: School of Computer Science and Engineering
Format: Conference Paper
Language: English
Published: 2019
Subjects: Deep Neural Network; DRNTU::Engineering::Computer science and engineering; Machine Learning
Online Access: https://hdl.handle.net/10356/89478
http://hdl.handle.net/10220/47703
collection NTU
description Diabetes mellitus, commonly known as diabetes, is a group of metabolic disorders that has affected hundreds of millions of people. Detecting diabetes is of great importance because of its severe complications. There have been many research studies on diabetes identification, many of which are based on the Pima Indian diabetes data set. This data set comes from a study of women in the Pima Indian population that began in 1965, a population in which the onset rate of diabetes is comparatively high. Most earlier studies focused on one or two particular complex techniques, while a comprehensive study of many common techniques is missing. In this paper, we make a comprehensive exploration of the most popular techniques used to identify diabetes (e.g. DNN (Deep Neural Network), SVM (Support Vector Machine), etc.) and of data preprocessing methods. We evaluate these techniques by their cross-validation accuracy on the Pima Indian data set. We compare the accuracy of each classifier across several data preprocessing methods, and we tune the parameters to improve accuracy. The best technique we find achieves 77.86% accuracy under 10-fold cross-validation. We also analyze the relevance of each feature to the classification result.
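The evaluation protocol described in the abstract (each classifier paired with each preprocessing method, scored by 10-fold cross-validation accuracy) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: it assumes scikit-learn, uses a synthetic stand-in with the same shape as the Pima Indian data set (768 samples, 8 features) rather than the real data, and the particular preprocessors, the small `MLPClassifier` standing in for a DNN, and all hyperparameters are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, MinMaxScaler, StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in (assumption): same shape as the Pima Indian data set,
# but NOT the real data, so the scores say nothing about diabetes.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)

# Hypothetical choice of preprocessing methods to compare.
preprocessors = {
    "none": FunctionTransformer(),  # identity transform (no preprocessing)
    "standardize": StandardScaler(),
    "min-max": MinMaxScaler(),
}

# Hypothetical classifiers; the MLP is a small stand-in for a DNN.
classifiers = {
    "SVM": SVC(kernel="rbf", C=1.0),
    "MLP": MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500, random_state=0),
}

# 10-fold cross-validation, stratified so each fold keeps the class balance.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)


def compare(X, y):
    """Return mean 10-fold accuracy for every (preprocessor, classifier) pair."""
    results = {}
    for p_name, prep in preprocessors.items():
        for c_name, clf in classifiers.items():
            # The pipeline refits the preprocessor inside each training fold,
            # so no statistics leak from the held-out fold.
            pipe = make_pipeline(prep, clf)
            scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
            results[(p_name, c_name)] = scores.mean()
    return results


results = compare(X, y)
for (p_name, c_name), acc in sorted(results.items()):
    print(f"{c_name:4s} + {p_name:12s}: mean accuracy = {acc:.4f}")
```

Wrapping the preprocessor and classifier in one pipeline is a design choice that keeps scaling parameters from being fitted on held-out folds; whether the paper did this is not stated in the record.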
id ntu-10356/89478
institution Nanyang Technological University
Conference: 2018 IEEE 4th World Forum on Internet of Things (WF-IoT)
Research Centre: NTU-UBC Research Centre of Excellence in Active Living for the Elderly
Version: Accepted version
Record dates: 2019-02-19T06:34:52Z; 2019-12-06T17:26:36Z; last updated 2020-03-07T11:48:46Z
Citation: Wei, S., Zhao, X., & Miao, C. (2018). A comprehensive exploration to the machine learning techniques for diabetes identification. 2018 IEEE 4th World Forum on Internet of Things (WF-IoT). doi:10.1109/WF-IoT.2018.8355130
Identifiers: 10.1109/WF-IoT.2018.8355130; 208286
Rights: © 2018 Institute of Electrical and Electronics Engineers (IEEE). All rights reserved. This paper was published in 2018 IEEE 4th World Forum on Internet of Things (WF-IoT) and is made available with permission of Institute of Electrical and Electronics Engineers (IEEE).
Extent: 5 p.
File format: application/pdf
topic Deep Neural Network
DRNTU::Engineering::Computer science and engineering
Machine Learning