Count data models for outpatient health services utilisation

Abstract Background Count data from the national survey captures healthcare utilisation within a specific reference period, resulting in excess zeros and skewed positive tails. Often, it is modelled using count data models. This study aims to identify the best-fitting model for outpatient healthcare...

Bibliographic Details
Main Authors: Nurul Salwana Abu Bakar, Jabrullah Ab Hamid, Mohd Shaiful Jefri Mohd Nor Sham, Mohd Nor Sham, Anis Syakira Jailani
Format: Article
Language: English
Published: BMC 2022-10-01
Series: BMC Medical Research Methodology
Subjects: Healthcare utilisation; Outpatient; Count model; Zero-inflated model; Health behavioral model
Online Access: https://doi.org/10.1186/s12874-022-01733-3
author Nurul Salwana Abu Bakar
Jabrullah Ab Hamid
Mohd Shaiful Jefri Mohd Nor Sham
Mohd Nor Sham
Anis Syakira Jailani
collection DOAJ
description Abstract
Background: Count data from the national survey capture healthcare utilisation within a specific reference period, resulting in excess zeros and skewed positive tails. Such data are often modelled using count data models. This study aims to identify the best-fitting model for outpatient healthcare utilisation using data from the Malaysian National Health and Morbidity Survey 2019 (NHMS 2019) and to identify utilisation factors among adults in Malaysia.
Methods: The frequency of outpatient visits is the dependent variable, and instrumental variable selection is based on Andersen’s model. Six different models were used: ordinary least squares (OLS), Poisson regression, negative binomial regression (NB), two inflated models, namely zero-inflated Poisson and marginalized zero-inflated negative binomial (MZINB), and a hurdle model. The best-fitting model was identified using model selection criteria, goodness-of-fit, and statistical tests of the factors associated with outpatient visits.
Results: The frequency of zero visits was 90%. Of the sample, 8.35% of adults utilized healthcare services only once, and 1.04% utilized them twice. The mean-variance value varied between 0.14 and 0.39. Across the six models, the zero-inflated model (ZIM) possessed the smallest log-likelihood, Akaike information criterion (AIC), and Bayesian information criterion (BIC), and a positive Vuong corrected value. Fourteen instrumental variables were identified: five predisposing factors, six enablers, and three need factors. Data overdispersion is characterized by excess zeros, a large mean to variance value, and skewed positive tails. We assumed frequency and true zeros throughout the study reference period. ZIM is the best-fitting model based on the model selection criteria, the smallest Root Mean Square Error (RMSE), and a higher R². Vuong corrected and uncorrected values obtained with different Stata commands were both positive and differed only slightly.
Conclusion: State as a place of residence, ethnicity, household income quintile, and health needs were significantly associated with healthcare utilisation. Our findings suggest using ZIM over traditional OLS. This study encourages the use of this count data model as it has a better fit, is easy to interpret, and has appropriate assumptions based on the survey methodology.
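The model comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis (the abstract indicates Stata was used) and it does not use NHMS 2019 data: the frequency table `TOY_FREQ`, the function names, and the crude grid search are all assumptions made for illustration. The sketch fits an ordinary Poisson and a zero-inflated Poisson to a table with roughly 90% zeros and computes AIC and BIC for each, using the usual convention that lower values indicate a better trade-off between fit and complexity.

```python
import math

# Hypothetical toy table {visits: respondents}; NOT the NHMS 2019 data.
TOY_FREQ = {0: 900, 1: 80, 2: 20}

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson count with mean lam > 0."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: a structural zero with probability pi,
    otherwise an ordinary Poisson(lam) draw."""
    p = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + p if k == 0 else p

def log_likelihood(freq, pmf):
    """Log-likelihood of a {count: number_of_respondents} table."""
    return sum(n * math.log(pmf(k)) for k, n in freq.items())

def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * loglik

def compare_models(freq):
    """Fit Poisson and ZIP (intercept-only) and return their criteria.
    Assumes at least one positive count so the sample mean is > 0."""
    n_obs = sum(freq.values())
    # The Poisson maximum-likelihood estimate is the sample mean.
    lam_hat = sum(k * n for k, n in freq.items()) / n_obs
    ll_pois = log_likelihood(freq, lambda k: poisson_pmf(k, lam_hat))
    # Crude grid search for the ZIP parameters (illustration only;
    # real software uses numerical optimisation).
    ll_zip, pi_hat, lam_zip = max(
        (log_likelihood(freq, lambda k, p=p, l=l: zip_pmf(k, p, l)), p, l)
        for p in (i / 100 for i in range(100))
        for l in (j / 100 for j in range(1, 301))
    )
    return {
        "poisson": {"loglik": ll_pois, "aic": aic(ll_pois, 1),
                    "bic": bic(ll_pois, 1, n_obs), "lam": lam_hat},
        "zip": {"loglik": ll_zip, "aic": aic(ll_zip, 2),
                "bic": bic(ll_zip, 2, n_obs), "pi": pi_hat, "lam": lam_zip},
    }

if __name__ == "__main__":
    for name, stats in compare_models(TOY_FREQ).items():
        print(name, {key: round(val, 2) for key, val in stats.items()})
```

The sketch is intercept-only. A regression version would let both the zero-inflation probability and the Poisson mean depend on covariates such as the predisposing, enabling, and need factors named in the abstract.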
first_indexed 2024-04-13T22:45:13Z
format Article
id doaj.art-162d307a459648e5b5aa8a42b42e41bd
institution Directory Open Access Journal
issn 1471-2288
language English
last_indexed 2024-04-13T22:45:13Z
publishDate 2022-10-01
publisher BMC
series BMC Medical Research Methodology
spelling doaj.art-162d307a459648e5b5aa8a42b42e41bd
2022-12-22T02:26:25Z
eng
BMC
BMC Medical Research Methodology
1471-2288
2022-10-01
22119
10.1186/s12874-022-01733-3
Count data models for outpatient health services utilisation
Nurul Salwana Abu Bakar (0)
Jabrullah Ab Hamid (1)
Mohd Shaiful Jefri Mohd Nor Sham (2)
Mohd Nor Sham (3)
Anis Syakira Jailani (4)
(0) Centre for Health Policy Research, Institute for Health Systems Research, National Institutes of Health, Ministry of Health
(1) Centre for Health Equity Research, Institute for Health Systems Research, National Institutes of Health, Ministry of Health
(2) Centre for Health Economics Research, Institute for Health Systems Research, National Institutes of Health, Ministry of Health
(3) Centre for Health Economics Research, Institute for Health Systems Research, National Institutes of Health, Ministry of Health
(4) Centre for Health Outcome Research, Institute for Health Systems Research, National Institutes of Health, Ministry of Health
https://doi.org/10.1186/s12874-022-01733-3
Healthcare utilisation
Outpatient
Count model
Zero-inflated model
Health behavioral model
title Count data models for outpatient health services utilisation
topic Healthcare utilisation
Outpatient
Count model
Zero-inflated model
Health behavioral model
url https://doi.org/10.1186/s12874-022-01733-3