Comparison of linear, penalized linear and machine learning models predicting hospital visit costs from chronic disease in Thailand

Health care costs from chronic diseases are generally positively skewed, which causes problems when using traditional statistical models. Machine learning is a conventional method that produces accurate predictions with large sample sizes. However, much of the comparison performance between statistical metho...


Bibliographic Details
Main Authors: Wichayaporn Thongpeth, M.N.S., Apiradee Lim, Ph.D., Akemat Wongpairin, M.P.H., Thaworn Thongpeth, M.D., Santhana Chaimontree, Ph.D.
Format: Article
Language: English
Published: Elsevier 2021-01-01
Series: Informatics in Medicine Unlocked
Subjects: Chronic disease; Machine learning; Health care cost; Prediction performance
Online Access: http://www.sciencedirect.com/science/article/pii/S2352914821002434
_version_ 1818717205974482944
author Wichayaporn Thongpeth, M.N.S.
Apiradee Lim, Ph.D.
Akemat Wongpairin, M.P.H.
Thaworn Thongpeth, M.D.
Santhana Chaimontree, Ph.D.
collection DOAJ
description Health care costs from chronic diseases are generally positively skewed, which causes problems when using traditional statistical models. Machine learning is a conventional method that produces accurate predictions with large sample sizes. However, comparisons of performance between statistical methods and machine learning for such data remain scattered. This study aimed to compare the performance of linear, penalized linear and machine learning models in predicting hospital visit costs from chronic disease in Thailand. A total of 18,342 hospital visit records were obtained from Suratthani tertiary hospital in southern Thailand, which contained data from 2016 on chronic patients of Diagnosis-Related Groups (DRGs). The prediction performance of linear, penalized linear and machine learning models on hospital visit costs was compared using both the original dataset and datasets expanded two- and four-fold in size by bootstrap. The mean age of patients was 56.3 ± 22.6 years, with 55.6% of visits by males. The median hospital cost was 16,662 Baht per visit. The random forest (RF) model had the best predictive performance for hospital visit costs at all dataset sizes, with the smallest prediction errors, whereas ridge linear regression had the poorest, with the largest prediction errors. Machine learning models predicted better with enlarged sample sizes, whereas linear and penalized linear models did not. For prediction modeling with big data, machine learning models are preferable, whereas the predictions of linear and penalized linear models are not affected by increasing the sample size.
first_indexed 2024-12-17T19:31:28Z
format Article
id doaj.art-59b21c39dc4e4e46838fd58686d57cf1
institution Directory Open Access Journal
issn 2352-9148
language English
last_indexed 2024-12-17T19:31:28Z
publishDate 2021-01-01
publisher Elsevier
record_format Article
series Informatics in Medicine Unlocked
spelling doaj.art-59b21c39dc4e4e46838fd58686d57cf1
2022-12-21T21:35:15Z
eng
Elsevier
Informatics in Medicine Unlocked
2352-9148
2021-01-01
26
100769
Comparison of linear, penalized linear and machine learning models predicting hospital visit costs from chronic disease in Thailand
Wichayaporn Thongpeth, M.N.S. (0)
Apiradee Lim, Ph.D. (1)
Akemat Wongpairin, M.P.H. (2)
Thaworn Thongpeth, M.D. (3)
Santhana Chaimontree, Ph.D. (4)
(0) Department of Mathematics and Computer Science, Faculty of Science and Technology, Prince of Songkla University, Pattani Campus, Mueang, Pattani, 94000, Thailand
(1) Department of Mathematics and Computer Science, Faculty of Science and Technology, Prince of Songkla University, Pattani Campus, Mueang, Pattani, 94000, Thailand; Corresponding author.
(2) Department of Mathematics and Computer Science, Faculty of Science and Technology, Prince of Songkla University, Pattani Campus, Mueang, Pattani, 94000, Thailand
(3) Orthopedic Surgery and Preventive Medicine, Suratthani Hospital, Ministry of Public Health, Mueang, Surat Thani, 84000, Thailand
(4) Department of Mathematics and Computer Science, Faculty of Science and Technology, Prince of Songkla University, Pattani Campus, Mueang, Pattani, 94000, Thailand
http://www.sciencedirect.com/science/article/pii/S2352914821002434
Chronic disease
Machine learning
Health care cost
Prediction performance
title Comparison of linear, penalized linear and machine learning models predicting hospital visit costs from chronic disease in Thailand
topic Chronic disease
Machine learning
Health care cost
Prediction performance
url http://www.sciencedirect.com/science/article/pii/S2352914821002434