Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models

Abstract Background Power analysis becomes an inevitable step in experimental design of current biomedical research. Complex designs allowing diverse correlation structures are commonly used in RNA-Seq experiments. However, the field currently lacks statistical methods to calculate sample size and e...

Bibliographic Details
Main Authors: Lianbo Yu, Soledad Fernandez, Guy Brock
Format: Article
Language: English
Published: BMC 2020-05-01
Series: BMC Bioinformatics
Subjects: RNA-Seq; Power analysis; Bivariate negative binomial; Generalized linear mixed effects model
Online Access: http://link.springer.com/article/10.1186/s12859-020-3541-7
author Lianbo Yu
Soledad Fernandez
Guy Brock
collection DOAJ
description Abstract
Background: Power analysis becomes an inevitable step in experimental design of current biomedical research. Complex designs allowing diverse correlation structures are commonly used in RNA-Seq experiments. However, the field currently lacks statistical methods to calculate sample size and estimate power for RNA-Seq differential expression studies using such designs. To fill the gap, simulation based methods have a great advantage by providing numerical solutions, since theoretical distributions of test statistics are typically unavailable for such designs.
Results: In this paper, we propose a novel simulation based procedure for power estimation of differential expression with the employment of generalized linear mixed effects models for correlated expression data. We also propose a new procedure for power estimation of differential expression with the use of a bivariate negative binomial distribution for paired designs. We compare the performance of both the likelihood ratio test and Wald test under a variety of simulation scenarios with the proposed procedures. The simulated distribution was used to estimate the null distribution of test statistics in order to achieve the desired false positive control and was compared to the asymptotic Chi-square distribution. In addition, we applied the procedure for paired designs to the TCGA breast cancer data set.
Conclusions: In summary, we provide a framework for power estimation of RNA-Seq differential expression under complex experimental designs. Simulation results demonstrate that both the proposed procedures properly control the false positive rate at the nominal level.
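The simulation-based idea summarized in the abstract can be illustrated with a minimal sketch for a paired design. This is not the authors' procedure: it assumes a shared gamma subject effect to induce correlated negative binomial counts within a pair, and it swaps in a simple Wald-type statistic on paired log ratios in place of a fitted generalized linear mixed effects model. All function names and parameter values (`mu`, `fold`, `phi`, `n_pairs`) are hypothetical and chosen only for illustration. The critical value is taken from a simulated null distribution rather than an asymptotic Chi-square reference.

```python
import math
import random
import statistics


def rpois(lam, rng):
    """Knuth's Poisson sampler; adequate for the small means used here."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_pair(mu, fold, phi, rng):
    """One subject's (control, treated) counts.

    Assumption: a shared gamma effect with mean 1 and variance phi makes
    both counts negative binomial and positively correlated.
    """
    g = rng.gammavariate(1.0 / phi, phi)
    return rpois(mu * g, rng), rpois(mu * fold * g, rng)


def paired_stat(pairs):
    """Absolute Wald-type statistic on paired log ratios (illustrative only)."""
    d = [math.log((y2 + 0.5) / (y1 + 0.5)) for y1, y2 in pairs]
    se = statistics.stdev(d) / math.sqrt(len(d))
    # Degenerate case (all log ratios identical): treat as no evidence.
    return abs(statistics.mean(d)) / se if se > 0 else 0.0


def simulate_stats(n_pairs, mu, fold, phi, n_sim, rng):
    """Test statistics from n_sim simulated experiments for one gene."""
    return [
        paired_stat([simulate_pair(mu, fold, phi, rng) for _ in range(n_pairs)])
        for _ in range(n_sim)
    ]


def estimate_power(n_pairs, mu, fold, phi, alpha=0.05, n_sim=2000, seed=1):
    """Estimate power using a simulated null distribution for the cutoff."""
    rng = random.Random(seed)
    # Null distribution: same design, fold change fixed at 1.
    null = sorted(simulate_stats(n_pairs, mu, 1.0, phi, n_sim, rng))
    crit = null[min(n_sim - 1, math.ceil((1 - alpha) * n_sim))]
    # Alternative distribution: the fold change of interest.
    alt = simulate_stats(n_pairs, mu, fold, phi, n_sim, rng)
    return sum(t > crit for t in alt) / n_sim


if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, estimate_power(n_pairs=n, mu=20, fold=2.0, phi=0.2))
```

Running the sketch over a grid of `n_pairs` values gives a rough power curve from which a sample size could be read off. A fuller treatment would fit a mixed model per gene and account for multiple testing across genes.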
first_indexed 2024-12-11T12:26:58Z
format Article
id doaj.art-bd6467c1005a4efbb8d0e3033db6033c
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-11T12:26:58Z
publishDate 2020-05-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-bd6467c1005a4efbb8d0e3033db6033c
2022-12-22T01:07:22Z
eng
BMC
BMC Bioinformatics
1471-2105
2020-05-01
Volume 21, issue 1, pages 1-12
10.1186/s12859-020-3541-7
Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models
Lianbo Yu (0), Soledad Fernandez (1), Guy Brock (2)
(0, 1, 2) Center for Biostatistics, Department of Biomedical Informatics, The Ohio State University
http://link.springer.com/article/10.1186/s12859-020-3541-7
RNA-Seq
Power analysis
Bivariate negative binomial
Generalized linear mixed effects Model
title Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models
topic RNA-Seq
Power analysis
Bivariate negative binomial
Generalized linear mixed effects Model
url http://link.springer.com/article/10.1186/s12859-020-3541-7