Basic Features of the Analysis of Germination Data with Generalized Linear Mixed Models

Germination data are discrete and binomial. Although analysis of variance (ANOVA) has long been used for the statistical analysis of these data, generalized linear mixed models (GzLMMs) provide a more consistent theoretical framework. GzLMMs are suitable for final germination percentages (FGP) as we...

Bibliographic Details
Main Author: Alberto Gianinetti
Format: Article
Language: English
Published: MDPI AG 2020-01-01
Series: Data
Subjects: binomial data; germination test; over-dispersion; under-dispersion; random effects
Online Access: https://www.mdpi.com/2306-5729/5/1/6
author Alberto Gianinetti
collection DOAJ
description Germination data are discrete and binomial. Although analysis of variance (ANOVA) has long been used for the statistical analysis of these data, generalized linear mixed models (GzLMMs) provide a more consistent theoretical framework. GzLMMs are suitable for final germination percentages (FGP) as well as longitudinal studies of germination time-courses. Germination indices (i.e., single-value parameters summarizing the results of a germination assay by combining the level and rapidity of germination) and other data with a Gaussian error distribution can be analyzed too. There are, however, different kinds of GzLMMs: Conditional (i.e., random effects are modeled as deviations from the general intercept with a specific covariance structure), marginal (i.e., random effects are modeled solely as a variance/covariance structure of the error terms), and quasi-marginal (some random effects are modeled as deviations from the intercept and some are modeled as a covariance structure of the error terms) models can be applied to the same data. 
It is shown that: (a) For germination data, conditional, marginal, and quasi-marginal GzLMMs tend to converge to a similar inference; (b) conditional models are the first choice for FGP; (c) marginal or quasi-marginal models are more suited for longitudinal studies, although conditional models lead to a congruent inference; (d) in general, common random factors are better dealt with as random intercepts, whereas serial correlation is easier to model in terms of the covariance structure of the error terms; (e) germination indices are not binomial and can be easier to analyze with a marginal model; (f) in boundary conditions (when some means approach 0% or 100%), conditional models with an integral approximation of true likelihood are more appropriate; in non-boundary conditions, (g) germination data can be fitted with default pseudo-likelihood estimation techniques, on the basis of the SAS-based code templates provided here; (h) GzLMMs are remarkably good for the analysis of germination data except if some means are 0% or 100%. In this case, alternative statistical approaches may be used, such as survival analysis or linear mixed models (LMMs) with transformed data, unless an ad hoc data adjustment in estimates of limit means is considered, either experimentally or computationally. This review is intended as a basic tutorial for the application of GzLMMs, and is, therefore, of interest primarily to researchers in the agricultural sciences.
first_indexed 2024-04-11T22:20:35Z
format Article
id doaj.art-f82ee0dab26e481f84a3258d5e0dd406
institution Directory Open Access Journal
issn 2306-5729
language English
last_indexed 2024-04-11T22:20:35Z
publishDate 2020-01-01
publisher MDPI AG
record_format Article
series Data
spelling doaj.art-f82ee0dab26e481f84a3258d5e0dd406; 2022-12-22T04:00:13Z; eng; MDPI AG; Data; ISSN 2306-5729; 2020-01-01; vol. 5, no. 1, art. 6; DOI 10.3390/data5010006; Basic Features of the Analysis of Germination Data with Generalized Linear Mixed Models; Alberto Gianinetti (Council for Agricultural Research and Economics - Research Centre for Genomics and Bioinformatics, via S. Protaso 302, 29017 Fiorenzuola d’Arda (PC), Italy); https://www.mdpi.com/2306-5729/5/1/6; binomial data; germination test; over-dispersion; under-dispersion; random effects
title Basic Features of the Analysis of Germination Data with Generalized Linear Mixed Models
topic binomial data
germination test
over-dispersion
under-dispersion
random effects
url https://www.mdpi.com/2306-5729/5/1/6