Comparison between Statistical Models and Machine Learning Methods on Classification for Highly Imbalanced Multiclass Kidney Data

This study aims to compare the classification performance of statistical models on highly imbalanced kidney data. The health examination cohort database provided by the National Health Insurance Service in Korea is utilized to build models with various machine learning methods. The glomerular filtra...

Bibliographic Details
Main Authors:	Bomi Jeong, Hyunjeong Cho, Jieun Kim, Soon Kil Kwon, SeungWoo Hong, ChangSik Lee, TaeYeon Kim, Man Sik Park, Seoksu Hong, Tae-Young Heo
Format:	Article
Language:	English
Published:	MDPI AG 2020-06-01
Series:	Diagnostics
Subjects:	imbalanced data autoencoder machine learning chronic kidney disease national health screening
Online Access:	https://www.mdpi.com/2075-4418/10/6/415

_version_	1797564865891532800
author	Bomi Jeong Hyunjeong Cho Jieun Kim Soon Kil Kwon SeungWoo Hong ChangSik Lee TaeYeon Kim Man Sik Park Seoksu Hong Tae-Young Heo
author_facet	Bomi Jeong Hyunjeong Cho Jieun Kim Soon Kil Kwon SeungWoo Hong ChangSik Lee TaeYeon Kim Man Sik Park Seoksu Hong Tae-Young Heo
author_sort	Bomi Jeong
collection	DOAJ
description	This study compares the classification performance of statistical models and machine learning methods on highly imbalanced kidney data. The health examination cohort database provided by the National Health Insurance Service in Korea is used to build the models. The glomerular filtration rate (GFR) is used to diagnose chronic kidney disease (CKD). It is estimated using the Modification of Diet in Renal Disease (MDRD) method and classified into five stages (1, 2, 3, 4, and 5), with stage 3 split into 3A and 3B. The resulting CKD stages, based on the estimated GFR, form the six classes of the response variable. This study uses two representative generalized linear models for classification, namely, multinomial logistic regression (multinomial LR) and ordinal logistic regression (ordinal LR), as well as two machine learning models, namely, random forest (RF) and autoencoder (AE). The classification performance of the four models is compared in terms of accuracy, sensitivity, specificity, precision, and F1-measure. To find the model that best classifies CKD stages, the data are divided into 10 folds that preserve the proportion of each CKD stage. Results indicate that RF and AE achieve higher accuracy than the multinomial and ordinal LR models when classifying the response variable. However, when a highly imbalanced dataset is modeled, accuracy can misrepresent actual performance, because it remains high even when a model assigns minority-class cases to the majority class. To address this problem in performance interpretation, we consider not only accuracy from the confusion matrix but also sensitivity, specificity, precision, and F1-measure for each class. To present classification performance with a single value for each model, we calculate the macro-average and micro-weighted values for each model. We conclude that AE is the best model for classifying CKD stages correctly across all performance indices.
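The point the abstract makes about accuracy on imbalanced data can be illustrated with a minimal sketch. The confusion matrix below is a made-up three-class example, not data from the article, and the function names are hypothetical. It shows a model that assigns every case to the majority class: accuracy stays high while macro-averaged sensitivity and F1 collapse. Only the macro-average is shown; the article's "micro-weighted" value is not reproduced here because the record does not define it.

```python
from typing import Dict, List


def accuracy(cm: List[List[int]]) -> float:
    """Overall accuracy; cm[i][j] counts true class i predicted as class j."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """One-vs-rest sensitivity, specificity, precision, and F1 for each class."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    rows = []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true c, predicted other
        fp = sum(cm[r][c] for r in range(k)) - tp   # other, predicted c
        tn = total - tp - fn - fp
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        rows.append({"sensitivity": sens, "specificity": spec,
                     "precision": prec, "f1": f1})
    return rows


def macro_average(rows: List[Dict[str, float]], key: str) -> float:
    """Unweighted mean of one metric over all classes."""
    return sum(r[key] for r in rows) / len(rows)


# Hypothetical counts: 950 majority, 40 minority, 10 rare cases,
# with every case predicted as the majority class (column 0).
toy_cm = [
    [950, 0, 0],
    [40, 0, 0],
    [10, 0, 0],
]

if __name__ == "__main__":
    metrics = per_class_metrics(toy_cm)
    print("accuracy:", accuracy(toy_cm))
    print("macro sensitivity:", macro_average(metrics, "sensitivity"))
    print("macro F1:", macro_average(metrics, "f1"))
```

In this toy case accuracy is 0.95 while macro-averaged sensitivity is one third, which is why per-class measures are reported alongside accuracy.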
first_indexed	2024-03-10T19:03:54Z
format	Article
id	doaj.art-1a2e02442dc44771919a6325fe0de7cb
institution	Directory Open Access Journal
issn	2075-4418
language	English
last_indexed	2024-03-10T19:03:54Z
publishDate	2020-06-01
publisher	MDPI AG
record_format	Article
series	Diagnostics
spelling	doaj.art-1a2e02442dc44771919a6325fe0de7cb | 2023-11-20T04:13:09Z | eng | MDPI AG | Diagnostics | ISSN 2075-4418 | 2020-06-01 | vol. 10, no. 6, art. 415 | doi:10.3390/diagnostics10060415 | Comparison between Statistical Models and Machine Learning Methods on Classification for Highly Imbalanced Multiclass Kidney Data | Bomi Jeong (Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea) | Hyunjeong Cho (Department of Internal Medicine, Chungbuk National University College of Medicine, Chungbuk 28644, Korea) | Jieun Kim (Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea) | Soon Kil Kwon (Department of Internal Medicine, Chungbuk National University College of Medicine, Chungbuk 28644, Korea) | SeungWoo Hong (Intelligent Network Research Section, Electronics and Telecommunications Research Institute, 218 Gajeong-ro, Yuseong-gu, Daejeon 34129, Korea) | ChangSik Lee (Intelligent Network Research Section, Electronics and Telecommunications Research Institute, 218 Gajeong-ro, Yuseong-gu, Daejeon 34129, Korea) | TaeYeon Kim (Intelligent Network Research Section, Electronics and Telecommunications Research Institute, 218 Gajeong-ro, Yuseong-gu, Daejeon 34129, Korea) | Man Sik Park (Department of Statistics, Sungshin Women’s University, Seoul 02844, Korea) | Seoksu Hong (Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea) | Tae-Young Heo (Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea) | https://www.mdpi.com/2075-4418/10/6/415 | imbalanced data; autoencoder; machine learning; chronic kidney disease; national health screening
title	Comparison between Statistical Models and Machine Learning Methods on Classification for Highly Imbalanced Multiclass Kidney Data
topic	imbalanced data autoencoder machine learning chronic kidney disease national health screening
url	https://www.mdpi.com/2075-4418/10/6/415

Similar Items