Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models
Abstract. Background: Power analysis becomes an inevitable step in experimental design of current biomedical research. Complex designs allowing diverse correlation structures are commonly used in RNA-Seq experiments. However, the field currently lacks statistical methods to calculate sample size and e...
Main Authors: | Lianbo Yu, Soledad Fernandez, Guy Brock |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2020-05-01 |
Series: | BMC Bioinformatics |
Subjects: | RNA-Seq; Power analysis; Bivariate negative binomial; Generalized linear mixed effects model |
Online Access: | http://link.springer.com/article/10.1186/s12859-020-3541-7 |
_version_ | 1818146916968431616 |
---|---|
author | Lianbo Yu, Soledad Fernandez, Guy Brock |
collection | DOAJ |
description | Abstract. Background: Power analysis becomes an inevitable step in experimental design of current biomedical research. Complex designs allowing diverse correlation structures are commonly used in RNA-Seq experiments. However, the field currently lacks statistical methods to calculate sample size and estimate power for RNA-Seq differential expression studies using such designs. To fill the gap, simulation-based methods have a great advantage by providing numerical solutions, since theoretical distributions of test statistics are typically unavailable for such designs. Results: In this paper, we propose a novel simulation-based procedure for power estimation of differential expression with the employment of generalized linear mixed effects models for correlated expression data. We also propose a new procedure for power estimation of differential expression with the use of a bivariate negative binomial distribution for paired designs. We compare the performance of both the likelihood ratio test and Wald test under a variety of simulation scenarios with the proposed procedures. The simulated distribution was used to estimate the null distribution of test statistics in order to achieve the desired false positive control and was compared to the asymptotic chi-square distribution. In addition, we applied the procedure for paired designs to the TCGA breast cancer data set. Conclusions: In summary, we provide a framework for power estimation of RNA-Seq differential expression under complex experimental designs. Simulation results demonstrate that both the proposed procedures properly control the false positive rate at the nominal level. |
first_indexed | 2024-12-11T12:26:58Z |
format | Article |
id | doaj.art-bd6467c1005a4efbb8d0e3033db6033c |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-12-11T12:26:58Z |
publishDate | 2020-05-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-bd6467c1005a4efbb8d0e3033db6033c; 2022-12-22T01:07:22Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2020-05-01; 21; 1; 1; 12; 10.1186/s12859-020-3541-7; Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models; Lianbo Yu, Soledad Fernandez, Guy Brock (Center for Biostatistics, Department of Biomedical Informatics, The Ohio State University); http://link.springer.com/article/10.1186/s12859-020-3541-7; RNA-Seq; Power analysis; Bivariate negative binomial; Generalized linear mixed effects model |
title | Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models |
topic | RNA-Seq; Power analysis; Bivariate negative binomial; Generalized linear mixed effects model |
url | http://link.springer.com/article/10.1186/s12859-020-3541-7 |
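
The abstract describes estimating power by simulation, with the null distribution of the test statistic obtained from simulated data rather than from the asymptotic chi-square distribution. The Python sketch below illustrates that general idea for a paired design under stated assumptions; it is not the authors' procedure. Assumptions introduced here for illustration only: paired counts come from a Poisson model with a shared gamma-distributed subject effect (one common way to obtain correlated negative binomial counts), the test statistic is a paired t-type statistic on log count ratios rather than a likelihood ratio or Wald statistic from a fitted mixed model, and every name and parameter value (`estimate_power`, `simulate_pairs`, `mean`, `fold_change`, `dispersion`) is hypothetical.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson variate (Knuth's method; adequate for moderate means)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_pairs(rng, n_pairs, mean, fold_change, dispersion):
    """Simulate paired counts that share a subject-level gamma effect.

    Assumption: the shared effect has mean 1 and variance `dispersion`,
    so each marginal count is negative binomial and the pair is correlated.
    """
    shape = 1.0 / dispersion
    pairs = []
    for _ in range(n_pairs):
        g = rng.gammavariate(shape, dispersion)  # shape, scale -> mean 1
        control = poisson(rng, mean * g)
        treated = poisson(rng, mean * fold_change * g)
        pairs.append((control, treated))
    return pairs


def paired_stat(pairs):
    """Absolute paired t-type statistic on log ratios (stand-in statistic)."""
    d = [math.log((b + 0.5) / (a + 0.5)) for a, b in pairs]
    sd = statistics.stdev(d)
    if sd == 0.0:
        return 0.0  # degenerate sample: treated as no evidence
    return abs(statistics.mean(d)) / (sd / math.sqrt(len(d)))


def estimate_power(n_pairs, mean, fold_change, dispersion,
                   alpha=0.05, n_sim=500, seed=1):
    """Return (estimated power, simulated critical value)."""
    rng = random.Random(seed)
    # Step 1: simulate under the null (fold change 1) to get the critical value.
    null_stats = sorted(
        paired_stat(simulate_pairs(rng, n_pairs, mean, 1.0, dispersion))
        for _ in range(n_sim)
    )
    idx = min(n_sim - 1, math.ceil((1.0 - alpha) * n_sim) - 1)
    crit = null_stats[idx]
    # Step 2: simulate under the alternative and count rejections.
    rejections = sum(
        paired_stat(simulate_pairs(rng, n_pairs, mean, fold_change, dispersion)) > crit
        for _ in range(n_sim)
    )
    return rejections / n_sim, crit


if __name__ == "__main__":
    power, crit = estimate_power(n_pairs=10, mean=20.0, fold_change=2.0, dispersion=0.2)
    print(f"simulated critical value: {crit:.3f}, estimated power: {power:.3f}")
```

Calling `estimate_power` with `fold_change=1.0` gives a rough check of the false positive rate, which should land near `alpha` when the simulated critical value is doing its job. Increasing `n_sim` tightens both estimates at the cost of run time.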