Count data models for outpatient health services utilisation
Abstract Background Count data from the national survey captures healthcare utilisation within a specific reference period, resulting in excess zeros and skewed positive tails. Often, it is modelled using count data models. This study aims to identify the best-fitting model for outpatient healthcare...
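Main Authors: | Nurul Salwana Abu Bakar, Jabrullah Ab Hamid, Mohd Shaiful Jefri Mohd Nor Sham, Mohd Nor Sham, Anis Syakira Jailani |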
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2022-10-01 |
Series: | BMC Medical Research Methodology |
Subjects: | Healthcare utilisation, Outpatient, Count model, Zero-inflated model, Health behavioral model |
Online Access: | https://doi.org/10.1186/s12874-022-01733-3 |
_version_ | 1817979647383568384 |
---|---|
author | Nurul Salwana Abu Bakar; Jabrullah Ab Hamid; Mohd Shaiful Jefri Mohd Nor Sham; Mohd Nor Sham; Anis Syakira Jailani |
author_sort | Nurul Salwana Abu Bakar |
collection | DOAJ |
description | Background: Count data from the national survey captures healthcare utilisation within a specific reference period, resulting in excess zeros and skewed positive tails. Often, it is modelled using count data models. This study aims to identify the best-fitting model for outpatient healthcare utilisation using data from the Malaysian National Health and Morbidity Survey 2019 (NHMS 2019) and utilisation factors among adults in Malaysia. Methods: The frequency of outpatient visits is the dependent variable, and instrumental variable selection is based on Andersen’s model. Six different models were used: ordinary least squares (OLS), Poisson regression, negative binomial regression (NB), inflated models: zero-inflated Poisson, marginalized-zero-inflated negative binomial (MZINB), and hurdle model. Identification of the best-fitting model was based on model selection criteria, goodness-of-fit and statistical test of the factors associated with outpatient visits. Results: The frequency of zero was 90%. Of the sample, 8.35% of adults utilized healthcare services only once, and 1.04% utilized them twice. The mean-variance value varied between 0.14 and 0.39. Across six models, the zero-inflated model (ZIM) possesses the smallest log-likelihood, Akaike information criterion, Bayesian information criterion, and a positive Vuong corrected value. Fourteen instrumental variables, five predisposing factors, six enablers, and three need factors were identified. Data overdispersion is characterized by excess zeros, a large mean to variance value, and skewed positive tails. We assumed frequency and true zeros throughout the study reference period. ZIM is the best-fitting model based on the model selection criteria, smallest Root Mean Square Error (RMSE) and higher R². Both Vuong corrected and uncorrected values with different Stata commands yielded positive values with small differences. Conclusion: State as a place of residence, ethnicity, household income quintile, and health needs were significantly associated with healthcare utilisation. Our findings suggest using ZIM over traditional OLS. This study encourages the use of this count data model as it has a better fit, is easy to interpret, and has appropriate assumptions based on the survey methodology. |
first_indexed | 2024-04-13T22:45:13Z |
format | Article |
id | doaj.art-162d307a459648e5b5aa8a42b42e41bd |
institution | Directory of Open Access Journals |
issn | 1471-2288 |
language | English |
last_indexed | 2024-04-13T22:45:13Z |
publishDate | 2022-10-01 |
publisher | BMC |
record_format | Article |
series | BMC Medical Research Methodology |
spelling | doaj.art-162d307a459648e5b5aa8a42b42e41bd; 2022-12-22T02:26:25Z; eng; BMC; BMC Medical Research Methodology; 1471-2288; 2022-10-01; 22119; 10.1186/s12874-022-01733-3; Count data models for outpatient health services utilisation. Authors and affiliations: Nurul Salwana Abu Bakar (Centre for Health Policy Research, Institute for Health Systems Research, National Institutes of Health, Ministry of Health); Jabrullah Ab Hamid (Centre for Health Equity Research, Institute for Health Systems Research, National Institutes of Health, Ministry of Health); Mohd Shaiful Jefri Mohd Nor Sham (Centre for Health Economics Research, Institute for Health Systems Research, National Institutes of Health, Ministry of Health); Mohd Nor Sham (Centre for Health Economics Research, Institute for Health Systems Research, National Institutes of Health, Ministry of Health); Anis Syakira Jailani (Centre for Health Outcome Research, Institute for Health Systems Research, National Institutes of Health, Ministry of Health). Keywords: Healthcare utilisation; Outpatient; Count model; Zero-inflated model; Health behavioral model |
title | Count data models for outpatient health services utilisation |
topic | Healthcare utilisation; Outpatient; Count model; Zero-inflated model; Health behavioral model |
url | https://doi.org/10.1186/s12874-022-01733-3 |
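
The abstract describes comparing count models on zero-heavy data using information criteria. The sketch below illustrates that general idea and is not the authors' analysis: it uses made-up counts (not NHMS 2019 data), fits only intercept-only Poisson and zero-inflated Poisson models with no covariates, and uses plain Python rather than the Stata commands mentioned in the record. All function and variable names are hypothetical. For AIC and BIC, a lower value indicates the preferred model.

```python
import math

# Hypothetical visit counts (NOT the NHMS 2019 data): 1,000 made-up respondents,
# 90% of whom report zero outpatient visits in the reference period.
counts = [0] * 900 + [1] * 84 + [2] * 10 + [3] * 4 + [4] * 2


def poisson_logpmf(y, lam):
    """Log of the Poisson probability of observing count y with mean lam."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)


def fit_poisson(data):
    """Maximum-likelihood Poisson fit: the estimate of lam is the sample mean."""
    lam = sum(data) / len(data)
    loglik = sum(poisson_logpmf(y, lam) for y in data)
    return lam, loglik


def zip_loglik(data, pi, lam):
    """Log-likelihood of a zero-inflated Poisson with extra-zero probability pi."""
    total = 0.0
    for y in data:
        if y == 0:
            total += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            total += math.log(1 - pi) + poisson_logpmf(y, lam)
    return total


def fit_zip(data, max_iter=10000, tol=1e-12):
    """Fit an intercept-only zero-inflated Poisson by expectation-maximisation."""
    n = len(data)
    n_zero = sum(1 for y in data if y == 0)
    total = sum(data)
    pi, lam = 0.5, total / n  # crude starting values
    for _ in range(max_iter):
        # E-step: probability that an observed zero is an "excess" zero.
        w = pi / (pi + (1 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson mean.
        new_pi = n_zero * w / n
        new_lam = total / (n - n_zero * w)
        converged = abs(new_pi - pi) < tol and abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam, zip_loglik(data, pi, lam)


def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


n = len(counts)
mean = sum(counts) / n
variance = sum((y - mean) ** 2 for y in counts) / (n - 1)
zero_share = counts.count(0) / n

pois_lam, pois_ll = fit_poisson(counts)
zip_pi, zip_lam, zip_ll = fit_zip(counts)

print(f"zeros: {zero_share:.1%}, mean/variance: {mean / variance:.2f}")
for name, ll, k in [("Poisson", pois_ll, 1), ("ZIP", zip_ll, 2)]:
    print(f"{name}: logLik={ll:.1f} AIC={aic(ll, k):.1f} BIC={bic(ll, k, n):.1f}")
```

On these illustrative counts the variance exceeds the mean, and the zero-inflated fit reproduces the observed share of zeros, which a single-parameter Poisson cannot do. A full comparison along the lines of the abstract would add covariates and the negative binomial, hurdle, and MZINB alternatives.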