Improving diabetes disease patients classification using stacking ensemble method with PIMA and local healthcare data

Diabetes mellitus, a chronic metabolic disorder, continues to be a major public health issue around the world. It is estimated that one in every two diabetics is undiagnosed. Early diagnosis and management of diabetes can also prevent or delay the onset of complications. With the help of a variety o...

Bibliographic Details
Main Authors:	Md Shamim Reza, Ruhul Amin, Rubia Yasmin, Woomme Kulsum, Sabba Ruhi
Format:	Article
Language:	English
Published:	Elsevier 2024-01-01
Series:	Heliyon
Subjects:	Diabetes; Classification; Machine learning; Deep learning; Stacking ensemble; Early diagnosis
Online Access:	http://www.sciencedirect.com/science/article/pii/S240584402400567X

_version_	1797328648667136000
author	Md Shamim Reza, Ruhul Amin, Rubia Yasmin, Woomme Kulsum, Sabba Ruhi
author_facet	Md Shamim Reza, Ruhul Amin, Rubia Yasmin, Woomme Kulsum, Sabba Ruhi
author_sort	Md Shamim Reza
collection	DOAJ
description	Diabetes mellitus, a chronic metabolic disorder, continues to be a major public health issue around the world. It is estimated that one in every two diabetics is undiagnosed. Early diagnosis and management of diabetes can prevent or delay the onset of complications. This study aims to detect the disease early using a variety of machine learning and deep learning models, stacking algorithms, and other techniques. We propose two stacking-based models for diabetes classification using a combination of the PIMA Indian diabetes dataset, simulated data, and additional data collected from a local healthcare facility. We use both classical and deep neural network stacking ensemble methods to combine the predictions of multiple classification models and improve classification accuracy and robustness. We validate the proposed models with both a train-test split and cross-validation (CV). The highest accuracy is obtained by a stacking ensemble with three NN architectures: 95.50 % accuracy, 94 % precision, 97 % recall, and a 96 % F1-score using 5-fold CV in the simulation study. On the PIMA Indian diabetes dataset, the stacked ML model reaches an accuracy of 75.03 % under the train-test split protocol and 77.10 % under the CV protocol. Performance scores under the CV protocol were higher by a margin of 2.23 % to 12 %. On the primary dataset, the proposed method achieves accuracy ranging from 92 % to 95 %, and precision, recall, and F1-score ranging from 88 % to 96 %, using classical and deep neural network (NN)-based stacking methods. The proposed dataset and ensemble method could be useful in the early detection and treatment of diabetes, as well as in the advancement of machine learning and data analysis techniques in the healthcare industry.
first_indexed	2024-03-08T06:54:43Z
format	Article
id	doaj.art-2eb7147da9c0411e94ba3f332c13fac6
institution	Directory Open Access Journal
issn	2405-8440
language	English
last_indexed	2024-03-08T06:54:43Z
publishDate	2024-01-01
publisher	Elsevier
record_format	Article
series	Heliyon
spelling	doaj.art-2eb7147da9c0411e94ba3f332c13fac6; 2024-02-03T06:37:49Z; eng; Elsevier; Heliyon; 2405-8440; 2024-01-01; 10; 2; e24536; Improving diabetes disease patients classification using stacking ensemble method with PIMA and local healthcare data; Md Shamim Reza (0), Ruhul Amin (1), Rubia Yasmin (2), Woomme Kulsum (3), Sabba Ruhi (4); (0) to (3) Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh; (4) Corresponding author; Department of Statistics, Pabna University of Science and Technology, Pabna, 6600, Bangladesh; http://www.sciencedirect.com/science/article/pii/S240584402400567X; Diabetes; Classification; Machine learning; Deep learning; Stacking ensemble; Early diagnosis
spellingShingle	Md Shamim Reza, Ruhul Amin, Rubia Yasmin, Woomme Kulsum, Sabba Ruhi; Improving diabetes disease patients classification using stacking ensemble method with PIMA and local healthcare data; Heliyon; Diabetes; Classification; Machine learning; Deep learning; Stacking ensemble; Early diagnosis
title	Improving diabetes disease patients classification using stacking ensemble method with PIMA and local healthcare data
title_full	Improving diabetes disease patients classification using stacking ensemble method with PIMA and local healthcare data
title_fullStr	Improving diabetes disease patients classification using stacking ensemble method with PIMA and local healthcare data
title_full_unstemmed	Improving diabetes disease patients classification using stacking ensemble method with PIMA and local healthcare data
title_short	Improving diabetes disease patients classification using stacking ensemble method with PIMA and local healthcare data
title_sort	improving diabetes disease patients classification using stacking ensemble method with pima and local healthcare data
topic	Diabetes; Classification; Machine learning; Deep learning; Stacking ensemble; Early diagnosis
url	http://www.sciencedirect.com/science/article/pii/S240584402400567X
work_keys_str_mv	AT mdshamimreza improvingdiabetesdiseasepatientsclassificationusingstackingensemblemethodwithpimaandlocalhealthcaredata AT ruhulamin improvingdiabetesdiseasepatientsclassificationusingstackingensemblemethodwithpimaandlocalhealthcaredata AT rubiayasmin improvingdiabetesdiseasepatientsclassificationusingstackingensemblemethodwithpimaandlocalhealthcaredata AT woommekulsum improvingdiabetesdiseasepatientsclassificationusingstackingensemblemethodwithpimaandlocalhealthcaredata AT sabbaruhi improvingdiabetesdiseasepatientsclassificationusingstackingensemblemethodwithpimaandlocalhealthcaredata
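
The description field in this record says the study combines the predictions of multiple classification models with a stacking ensemble and validates with 5-fold CV. The sketch below illustrates that general idea only and is not the authors' implementation: the synthetic data, the one-feature threshold base learners (`Stump`), and the small logistic-regression meta-learner (`fit_meta`) are all assumptions chosen so the example runs with the Python standard library alone.

```python
import math
import random
from statistics import mean


def make_data(n, seed):
    """Synthetic two-feature binary data (hypothetical, not the paper's data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label, 1.0), rng.gauss(1.5 * label, 1.0)])
        y.append(label)
    return X, y


class Stump:
    """Base learner: signed distance of one feature from a midpoint threshold."""

    def __init__(self, feature):
        self.feature = feature
        self.threshold = 0.0

    def fit(self, X, y):
        pos = [x[self.feature] for x, t in zip(X, y) if t == 1]
        neg = [x[self.feature] for x, t in zip(X, y) if t == 0]
        self.threshold = (mean(pos) + mean(neg)) / 2.0
        return self

    def score(self, x):
        return x[self.feature] - self.threshold


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta(S, y, lr=0.1, epochs=300):
    """Meta-learner: logistic regression on base scores (batch gradient descent)."""
    w, b = [0.0] * len(S[0]), 0.0
    n = len(S)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for s, t in zip(S, y):
            err = sigmoid(sum(wi * si for wi, si in zip(w, s)) + b) - t
            grad_w = [g + err * si for g, si in zip(grad_w, s)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fit_stacking(X, y, k=5):
    """Collect k-fold out-of-fold base scores, then train the meta-learner on them."""
    n = len(X)
    oof = [None] * n
    for fold in range(k):
        held_out = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in held_out]
        y_tr = [y[i] for i in range(n) if i not in held_out]
        bases = [Stump(f).fit(X_tr, y_tr) for f in range(len(X[0]))]
        for i in held_out:
            oof[i] = [m.score(X[i]) for m in bases]
    w, b = fit_meta(oof, y)
    bases = [Stump(f).fit(X, y) for f in range(len(X[0]))]  # refit on all training rows
    return bases, w, b


def predict(model, x):
    bases, w, b = model
    s = [m.score(x) for m in bases]
    return 1 if sigmoid(sum(wi * si for wi, si in zip(w, s)) + b) >= 0.5 else 0


X_train, y_train = make_data(300, seed=0)
X_test, y_test = make_data(200, seed=1)
model = fit_stacking(X_train, y_train)
accuracy = mean(int(predict(model, x) == t) for x, t in zip(X_test, y_test))
print(f"held-out accuracy: {accuracy:.3f}")
```

Training the meta-learner on out-of-fold scores rather than in-sample scores keeps it from rewarding base learners that merely memorised their training rows, which is the usual reason stacking is paired with k-fold CV.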

Similar Items