Comparison of linear, penalized linear and machine learning models predicting hospital visit costs from chronic disease in Thailand
Health care costs from chronic diseases are generally positively skewed, which causes problems for traditional statistical models. Machine learning is a commonly used approach that can produce accurate predictions with large sample sizes. However, evidence comparing the performance of statistical metho...
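Main Authors: | Wichayaporn Thongpeth, M.N.S.; Apiradee Lim, Ph.D.; Akemat Wongpairin, M.P.H.; Thaworn Thongpeth, M.D.; Santhana Chaimontree, Ph.D. |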
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2021-01-01 |
Series: | Informatics in Medicine Unlocked |
Subjects: | Chronic disease; Machine learning; Health care cost; Prediction performance |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2352914821002434 |
_version_ | 1818717205974482944 |
---|---|
author | Wichayaporn Thongpeth, M.N.S.; Apiradee Lim, Ph.D.; Akemat Wongpairin, M.P.H.; Thaworn Thongpeth, M.D.; Santhana Chaimontree, Ph.D. |
author_sort | Wichayaporn Thongpeth, M.N.S. |
collection | DOAJ |
description | Health care costs from chronic diseases are generally positively skewed, which causes problems for traditional statistical models. Machine learning is a commonly used approach that can produce accurate predictions with large sample sizes. However, evidence comparing the performance of statistical methods and machine learning on such data remains scattered. This study aimed to compare the performance of linear, penalized linear and machine learning models in predicting hospital visit costs from chronic disease in Thailand. A total of 18,342 hospital visit records were obtained from Suratthani tertiary hospital in southern Thailand; the records covered 2016 and patients with chronic disease in Diagnosis-Related Groups (DRGs). The prediction performance of the linear, penalized linear and machine learning models was compared using both the original dataset and datasets expanded two- and four-fold by bootstrap. The mean age of patients was 56.3 ± 22.6 years, and 55.6% of visits were by males. The median hospital cost was 16,662 Baht per visit. The random forest (RF) model had the best predictive performance for hospital visit costs at all dataset sizes, with the smallest prediction errors, whereas ridge linear regression had the poorest, with the largest prediction errors. Machine learning models predicted better with enlarged sample sizes, whereas linear and penalized linear models did not. For prediction on big data, machine learning models are preferable, whereas the predictions of linear and penalized linear models are not affected by increasing the sample size. |
first_indexed | 2024-12-17T19:31:28Z |
format | Article |
id | doaj.art-59b21c39dc4e4e46838fd58686d57cf1 |
institution | Directory of Open Access Journals |
issn | 2352-9148 |
language | English |
last_indexed | 2024-12-17T19:31:28Z |
publishDate | 2021-01-01 |
publisher | Elsevier |
record_format | Article |
series | Informatics in Medicine Unlocked |
spelling | doaj.art-59b21c39dc4e4e46838fd58686d57cf1; 2022-12-21T21:35:15Z; eng; Elsevier; Informatics in Medicine Unlocked; 2352-9148; 2021-01-01; 26; 100769; Comparison of linear, penalized linear and machine learning models predicting hospital visit costs from chronic disease in Thailand; Wichayaporn Thongpeth, M.N.S. (0); Apiradee Lim, Ph.D. (1); Akemat Wongpairin, M.P.H. (2); Thaworn Thongpeth, M.D. (3); Santhana Chaimontree, Ph.D. (4); (0) Department of Mathematics and Computer Science, Faculty of Science and Technology, Prince of Songkla University, Pattani Campus, Mueang, Pattani, 94000, Thailand; (1) Department of Mathematics and Computer Science, Faculty of Science and Technology, Prince of Songkla University, Pattani Campus, Mueang, Pattani, 94000, Thailand, corresponding author; (2) Department of Mathematics and Computer Science, Faculty of Science and Technology, Prince of Songkla University, Pattani Campus, Mueang, Pattani, 94000, Thailand; (3) Orthopedic Surgery and Preventive Medicine, Suratthani Hospital, Ministry of Public Health, Mueang, Surat Thani, 84000, Thailand; (4) Department of Mathematics and Computer Science, Faculty of Science and Technology, Prince of Songkla University, Pattani Campus, Mueang, Pattani, 94000, Thailand; http://www.sciencedirect.com/science/article/pii/S2352914821002434; Chronic disease; Machine learning; Health care cost; Prediction performance |
title | Comparison of linear, penalized linear and machine learning models predicting hospital visit costs from chronic disease in Thailand |
title_sort | comparison of linear penalized linear and machine learning models predicting hospital visit costs from chronic disease in thailand |
topic | Chronic disease; Machine learning; Health care cost; Prediction performance |
url | http://www.sciencedirect.com/science/article/pii/S2352914821002434 |
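
The description above mentions comparing models on the original dataset and on datasets expanded two- and four-fold by bootstrap. The following is a minimal sketch of that kind of comparison, not the authors' code. Everything in it is an assumption made for illustration: the data are synthetic (a log-normal "cost" that depends weakly on age), the function names `bootstrap_expand`, `fit_ridge_1d` and `rmse` are hypothetical, bootstrap expansion is taken to mean resampling rows with replacement, and only a one-predictor least-squares fit and its ridge-penalized variant are shown. The random forest from the study is omitted to keep the example within the Python standard library.

```python
import random
import statistics


def bootstrap_expand(records, factor, rng):
    """Return `factor` times as many rows, sampled with replacement (assumed scheme)."""
    return rng.choices(records, k=factor * len(records))


def fit_ridge_1d(xs, ys, lam=0.0):
    """Closed-form one-predictor ridge fit; lam=0 is ordinary least squares.

    The intercept is not penalized. Returns (slope, intercept).
    """
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)
    return slope, my - slope * mx


def rmse(model, xs, ys):
    """Root mean squared prediction error of a (slope, intercept) model."""
    slope, intercept = model
    sq = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return (sq / len(xs)) ** 0.5


rng = random.Random(0)
# Synthetic, right-skewed costs. These are NOT the study's hospital records.
ages = [rng.uniform(20, 90) for _ in range(500)]
costs = [rng.lognormvariate(9.0 + 0.01 * a, 0.8) for a in ages]
data = list(zip(ages, costs))
train, test = data[:400], data[400:]
test_x, test_y = zip(*test)

results = {}
for factor in (1, 2, 4):
    # factor 1 is the original training set; 2 and 4 are bootstrap-expanded.
    sample = train if factor == 1 else bootstrap_expand(train, factor, rng)
    xs, ys = zip(*sample)
    results[factor] = {
        "n": len(sample),
        "ols_rmse": rmse(fit_ridge_1d(xs, ys, 0.0), test_x, test_y),
        "ridge_rmse": rmse(fit_ridge_1d(xs, ys, 1000.0), test_x, test_y),
    }
    print(factor, results[factor])
```

Because resampling with replacement adds no new information, a linear fit on the expanded data should stay close to the fit on the original data; the sketch only shows the mechanics and makes no claim about the magnitudes reported in the article.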