Power in pairs: assessing the statistical value of paired samples in tests for differential expression

Abstract Background When genomics researchers design a high-throughput study to test for differential expression, some biological systems and research questions provide opportunities to use paired samples from subjects, and researchers can plan for a certain proportion of subjects to have paired sam...


Bibliographic Details
Main Authors: John R. Stevens, Jennifer S. Herrick, Roger K. Wolff, Martha L. Slattery
Format: Article
Language: English
Published: BMC 2018-12-01
Series: BMC Genomics
Subjects: Study design; Statistical power; RNA-Seq; Microarray; microRNA
Online Access: http://link.springer.com/article/10.1186/s12864-018-5236-2
_version_ 1818684247149379584
author John R. Stevens
Jennifer S. Herrick
Roger K. Wolff
Martha L. Slattery
collection DOAJ
description Abstract
Background: When genomics researchers design a high-throughput study to test for differential expression, some biological systems and research questions provide opportunities to use paired samples from subjects, and researchers can plan for a certain proportion of subjects to have paired samples. We consider the effect of this paired-samples proportion on the statistical power of the study, using characteristics of both count (RNA-Seq) and continuous (microarray) expression data from a colorectal cancer study.
Results: We demonstrate that a higher proportion of subjects with paired samples yields higher statistical power, for various total numbers of samples, and for various strengths of subject-level confounding factors. In the design scenarios considered, the statistical power in a fully-paired design is substantially (and in many cases several times) greater than in an unpaired design.
Conclusions: For the many biological systems and research questions where paired samples are feasible and relevant, substantial statistical power gains can be achieved at the study design stage when genomics researchers plan on using paired samples from the largest possible proportion of subjects. Any cost savings in a study design with unpaired samples are likely accompanied by underpowered and possibly biased results.
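The power argument in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' simulation, which uses RNA-Seq count and microarray data characteristics. It assumes a single continuous feature, an additive subject-level effect with standard deviation `sd_subject`, independent measurement noise with standard deviation `sd_noise`, known variances, and a two-sided z-test. All function and parameter names are hypothetical. Under these assumptions the subject effect cancels in within-subject differences, so the paired design has the smaller standard error for the same number of samples.

```python
from statistics import NormalDist


def z_test_power(delta, se, alpha=0.05):
    """Power of a two-sided z-test for true mean difference `delta`
    when the estimator has standard error `se` (variance assumed known)."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = delta / se
    # Probability that the test statistic falls in either rejection tail.
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


def design_power(delta, sd_subject, sd_noise, n_per_condition, paired, alpha=0.05):
    """Approximate power with `n_per_condition` samples in each condition.

    Assumed model (illustrative only): value = subject effect + condition effect + noise.
    paired=True  -> the same subjects are measured in both conditions, so the
                    subject effect cancels in each within-subject difference.
    paired=False -> different subjects in each condition, so the subject
                    effect adds to the variance of each group mean.
    """
    if paired:
        var_diff = 2 * sd_noise**2 / n_per_condition
    else:
        var_diff = 2 * (sd_subject**2 + sd_noise**2) / n_per_condition
    return z_test_power(delta, var_diff**0.5, alpha)


if __name__ == "__main__":
    # Stronger subject-level effects widen the gap between the two designs.
    for sd_subject in (0.5, 1.0, 2.0):
        paired = design_power(1.0, sd_subject, 1.0, 10, paired=True)
        unpaired = design_power(1.0, sd_subject, 1.0, 10, paired=False)
        print(f"sd_subject={sd_subject}: paired={paired:.3f}, unpaired={unpaired:.3f}")
```

In this sketch the two designs coincide when `sd_subject` is zero, and the paired design gains power as the subject-level effect grows, which is the qualitative pattern the abstract describes.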
first_indexed 2024-12-17T10:47:36Z
format Article
id doaj.art-f3e41dae97774203989294f3cf127fcb
institution Directory of Open Access Journals
issn 1471-2164
language English
last_indexed 2024-12-17T10:47:36Z
publishDate 2018-12-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-f3e41dae97774203989294f3cf127fcb 2022-12-21T21:52:05Z eng BMC BMC Genomics 1471-2164 2018-12-01 19 1 1 13 10.1186/s12864-018-5236-2
Power in pairs: assessing the statistical value of paired samples in tests for differential expression
John R. Stevens (0)
Jennifer S. Herrick (1)
Roger K. Wolff (2)
Martha L. Slattery (3)
0 Department of Mathematics and Statistics, Utah State University
1 Division of Epidemiology, Department of Internal Medicine, University of Utah
2 Division of Epidemiology, Department of Internal Medicine, University of Utah
3 Division of Epidemiology, Department of Internal Medicine, University of Utah
http://link.springer.com/article/10.1186/s12864-018-5236-2
Study design
Statistical power
RNA-Seq
Microarray
microRNA
title Power in pairs: assessing the statistical value of paired samples in tests for differential expression
topic Study design
Statistical power
RNA-Seq
Microarray
microRNA
url http://link.springer.com/article/10.1186/s12864-018-5236-2