Multivariate Generalized Linear Mixed Models for Count Data

Bibliographic Details
Main Authors: Guilherme Parreira da Silva, Henrique Aparecido Laureano, Ricardo Rasmussen Petterle, Paulo Justiniano Ribeiro Júnior, Wagner Hugo Bonat
Format: Article
Language: English
Published: Austrian Statistical Society 2024-01-01
Series: Austrian Journal of Statistics
Online Access: https://www.ajs.or.at/index.php/ajs/article/view/1574
author Guilherme Parreira da Silva
Henrique Aparecido Laureano
Ricardo Rasmussen Petterle
Paulo Justiniano Ribeiro Júnior
Wagner Hugo Bonat
collection DOAJ
description The literature on univariate regression models for count data is rich. However, this is not the case for multivariate count data. Therefore, we present the Multivariate Generalized Linear Mixed Models framework, which deals with a multivariate set of responses and measures the correlation between them through random effects that follow a multivariate normal distribution. This model is based on a GLMM with a random intercept, and the estimation process remains the same as for a standard GLMM, with the random effects integrated out via the Laplace approximation. We implemented this model efficiently using the TMB package available in R. We used the Poisson, negative binomial (NB), and COM-Poisson distributions. To assess the properties of the estimators, we conducted a simulation study considering four different sample sizes and three different correlation values for each distribution. We obtained unbiased and consistent estimators for the Poisson and NB distributions; for the COM-Poisson distribution, the estimators were consistent but biased, especially those for the dispersion, variance, and correlation parameters. These models were applied to two datasets. The first comes from 30 different sites in Australia, where the response is the number of times each of 41 different ant species was recorded; this results in 820 variance-covariance parameters and 41 dispersion parameters being estimated simultaneously, in addition to the regression parameters. The second is from the Australia Health Survey, with 5 response variables and 5190 respondents. Both datasets can be considered overdispersed according to the generalized dispersion index. The COM-Poisson model outperformed the other two models on three goodness-of-fit measures: AIC, BIC, and maximized log-likelihood. As a result, it estimated parameters with smaller standard errors and a greater number of significant correlation coefficients. Therefore, the proposed model is capable of dealing with multivariate count data, whether the responses are under-, equi-, or overdispersed, and of measuring any kind of correlation between them while taking into account the effects of the covariates.
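The data-generating mechanism described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' TMB implementation and performs no estimation: it only draws bivariate Poisson counts whose random intercepts follow a multivariate normal distribution, assuming a log link. The function names, parameter values, and the use of NumPy are illustrative assumptions. It also counts the pairwise correlations among 41 responses, 41 × 40 / 2 = 820, which matches the figure quoted for the ant dataset.

```python
import numpy as np


def n_correlations(n_responses: int) -> int:
    """Number of distinct pairwise correlations among the random intercepts."""
    return n_responses * (n_responses - 1) // 2


def simulate_counts(n, beta0, sd, rho, seed=0):
    """Draw n bivariate Poisson counts with correlated random intercepts.

    Hypothetical helper: the log link and all parameter values are assumptions.
    """
    rng = np.random.default_rng(seed)
    beta0 = np.asarray(beta0, dtype=float)
    sd = np.asarray(sd, dtype=float)
    corr = np.array([[1.0, rho], [rho, 1.0]])
    sigma = np.outer(sd, sd) * corr          # covariance of the random intercepts
    b = rng.multivariate_normal(np.zeros(2), sigma, size=n)
    mu = np.exp(beta0 + b)                   # conditional means under a log link
    return rng.poisson(mu)                   # counts, shape (n, 2)


y = simulate_counts(n=5000, beta0=[1.0, 0.5], sd=[0.5, 0.5], rho=0.8)

# Correlation between the two responses induced by the random effects.
print("response correlation:", np.corrcoef(y, rowvar=False)[0, 1])

# Marginal variance-to-mean ratio per response (a simple univariate index,
# not the generalized dispersion index used in the article).
print("variance / mean:", y.var(axis=0) / y.mean(axis=0))

print("correlations for 41 responses:", n_correlations(41))
```

Even though each response is conditionally Poisson, the random intercepts make the marginal counts overdispersed and positively correlated, which is the behaviour the multivariate model is designed to capture.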
first_indexed 2024-03-08T13:05:24Z
format Article
id doaj.art-967f2bbcb5564896a2e8677f84387e4e
institution Directory Open Access Journal
issn 1026-597X
language English
last_indexed 2024-03-08T13:05:24Z
publishDate 2024-01-01
publisher Austrian Statistical Society
record_format Article
series Austrian Journal of Statistics
spelling doaj.art-967f2bbcb5564896a2e8677f84387e4e
2024-01-18T20:26:44Z
eng
Austrian Statistical Society
Austrian Journal of Statistics
1026-597X
2024-01-01
vol. 53, no. 1
10.17713/ajs.v53i1.1574
Multivariate Generalized Linear Mixed Models for Count Data
Guilherme Parreira da Silva (Federal University of Paraná)
Henrique Aparecido Laureano (Instituto de Pesquisa Pelé Pequeno Príncipe)
Ricardo Rasmussen Petterle (Federal University of Paraná)
Paulo Justiniano Ribeiro Júnior (Federal University of Paraná)
Wagner Hugo Bonat (Federal University of Paraná)
https://www.ajs.or.at/index.php/ajs/article/view/1574
title Multivariate Generalized Linear Mixed Models for Count Data
url https://www.ajs.or.at/index.php/ajs/article/view/1574