Basic Features of the Analysis of Germination Data with Generalized Linear Mixed Models


Bibliographic Details
Main Author:	Alberto Gianinetti
Format:	Article
Language:	English
Published:	MDPI AG 2020-01-01
Series:	Data
Subjects:	binomial data; germination test; over-dispersion; under-dispersion; random effects
Online Access:	https://www.mdpi.com/2306-5729/5/1/6

Description:	Germination data are discrete and binomial. Although analysis of variance (ANOVA) has long been used for the statistical analysis of these data, generalized linear mixed models (GzLMMs) provide a more consistent theoretical framework. GzLMMs are suitable for final germination percentages (FGP) as well as longitudinal studies of germination time-courses. Germination indices (i.e., single-value parameters summarizing the results of a germination assay by combining the level and rapidity of germination) and other data with a Gaussian error distribution can be analyzed too. There are, however, different kinds of GzLMMs: conditional models (i.e., random effects are modeled as deviations from the general intercept with a specific covariance structure), marginal models (i.e., random effects are modeled solely as a variance/covariance structure of the error terms), and quasi-marginal models (some random effects are modeled as deviations from the intercept and some are modeled as a covariance structure of the error terms) can all be applied to the same data.
It is shown that: (a) for germination data, conditional, marginal, and quasi-marginal GzLMMs tend to converge to a similar inference; (b) conditional models are the first choice for FGP; (c) marginal or quasi-marginal models are more suited for longitudinal studies, although conditional models lead to a congruent inference; (d) in general, common random factors are better dealt with as random intercepts, whereas serial correlation is easier to model in terms of the covariance structure of the error terms; (e) germination indices are not binomial and can be easier to analyze with a marginal model; (f) in boundary conditions (when some means approach 0% or 100%), conditional models with an integral approximation of the true likelihood are more appropriate; (g) in non-boundary conditions, germination data can be fitted with default pseudo-likelihood estimation techniques, on the basis of the SAS-based code templates provided in the article; (h) GzLMMs are remarkably good for the analysis of germination data except if some means are 0% or 100%. In this case, alternative statistical approaches may be used, such as survival analysis or linear mixed models (LMMs) with transformed data, unless an ad hoc data adjustment in estimates of limit means is considered, either experimentally or computationally. This review is intended as a basic tutorial for the application of GzLMMs, and is, therefore, of interest primarily to researchers in the agricultural sciences.
Author Affiliation:	Council for Agricultural Research and Economics, Research Centre for Genomics and Bioinformatics, via S. Protaso 302, 29017 Fiorenzuola d’Arda (PC), Italy
ISSN:	2306-5729
DOI:	10.3390/data5010006
Source:	Directory of Open Access Journals (DOAJ), record doaj.art-f82ee0dab26e481f84a3258d5e0dd406
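
Illustrative note: point (f) of the description refers to conditional models fitted with an "integral approximation" of the likelihood. The sketch below is one way to picture what that means for a single germination dish. It is not taken from the article, whose templates are SAS-based. It assumes a logit link and a normally distributed random intercept per dish, and it uses a plain midpoint rule where statistical software would typically use a more refined quadrature. The function names and the values of `n`, `beta`, and `sigma` are hypothetical.

```python
import math


def inv_logit(eta):
    """Inverse logit link: linear predictor -> germination probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def binom_pmf(k, n, p):
    """Binomial probability of k germinated seeds out of n."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def normal_pdf(u, sigma):
    """Density of a zero-mean normal random intercept with SD sigma."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def marginal_likelihood(k, n, beta, sigma, steps=2000, width=8.0):
    """Integrate the conditional binomial likelihood over the random
    intercept u ~ N(0, sigma^2) with a midpoint rule on [-width*sigma, width*sigma].
    The midpoint rule is an assumption made for simplicity."""
    lo = -width * sigma
    h = (2.0 * width * sigma) / steps
    total = 0.0
    for i in range(steps):
        u = lo + (i + 0.5) * h
        total += binom_pmf(k, n, inv_logit(beta + u)) * normal_pdf(u, sigma)
    return total * h


# Hypothetical dish: 50 seeds, fixed effect 1.0 on the logit scale,
# dish-to-dish standard deviation 0.8 on the logit scale.
n, beta, sigma = 50, 1.0, 0.8
probs = [marginal_likelihood(k, n, beta, sigma) for k in range(n + 1)]

total_mass = sum(probs)  # should be close to 1 if the integration is adequate
marginal_mean = sum(k * p for k, p in enumerate(probs)) / n
conditional_mean = inv_logit(beta)  # mean for a dish whose random effect is 0

print("total probability mass:", round(total_mass, 6))
print("population-averaged germination:", round(marginal_mean, 4))
print("germination at random effect = 0:", round(conditional_mean, 4))
```

Under these assumptions, the population-averaged proportion lies closer to 50% than the proportion for a dish with a zero random effect. This is why estimates from conditional and marginal models are interpreted on slightly different scales even when, as the description states, they tend to lead to a similar inference.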
