Sample size for detecting differentially expressed genes in microarray experiments

Abstract. Background: Microarray experiments are often performed with a small number of biological replicates, resulting in low statistical power for detecting differentially expressed genes and concomitant high false positive rates. While increasing samp...

Bibliographic Details
Main Authors: Li Jiangning, Wei Caimiao, Bumgarner Roger E
Format: Article
Language: English
Published: BMC 2004-11-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/5/87
author Li Jiangning
Wei Caimiao
Bumgarner Roger E
collection DOAJ
description Abstract
Background: Microarray experiments are often performed with a small number of biological replicates, resulting in low statistical power for detecting differentially expressed genes and concomitant high false positive rates. While increasing sample size can increase statistical power and decrease error rates, using more samples than necessary wastes valuable resources. The question of how many replicates are required in a typical experimental system needs to be addressed. Of particular interest is the difference in required sample sizes for similar experiments in inbred vs. outbred populations (e.g. mouse and rat vs. human).
Results: We hypothesize that if all other factors (assay protocol, microarray platform, data pre-processing) were equal, fewer individuals would be needed for the same statistical power using inbred animals as opposed to unrelated human subjects, as genetic effects on gene expression will be removed in the inbred populations. We apply the same normalization algorithm and estimate the variance of gene expression for a variety of cDNA data sets (humans, inbred mice and rats) comparing two conditions. Using one-sample, paired-sample, or two-independent-sample t-tests, we calculate the sample sizes required to detect 1.5-, 2-, and 4-fold changes in expression level as a function of false positive rate, power, and the percentage of genes that have a standard deviation below a given percentile.
Conclusions: Factors that affect power and sample size calculations include variability of the population, the desired detectable differences, the power to detect the differences, and an acceptable error rate. In addition, experimental design, technical variability, and data pre-processing play a role in the power of the statistical tests in microarrays. We show that the number of samples required for detecting a 2-fold change with 90% probability and a p-value of 0.01 in humans is much larger than the number of samples commonly used in present-day studies, and that far fewer individuals are needed for the same statistical power when using inbred animals rather than unrelated human subjects.
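Illustrative note: the kind of calculation the abstract describes can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' procedure: it uses the standard normal approximation to the t-test sample-size formula (an exact t-based calculation gives somewhat larger numbers at small n), it works on log2 expression ratios, and the function name `samples_needed` and the standard deviations in the usage loop are hypothetical values chosen for illustration, not figures from the paper.

```python
import math
from statistics import NormalDist


def samples_needed(fold_change, sd_log2, alpha=0.01, power=0.90,
                   design="two-sample"):
    """Approximate number of replicates needed to detect a fold change.

    Normal approximation to the t-test sample-size formula (an assumption
    of this sketch; it slightly underestimates n when n is small).

    fold_change : detectable change on the raw scale, e.g. 2 for 2-fold
    sd_log2     : standard deviation of log2 expression (for "paired",
                  the standard deviation of the paired differences)
    alpha       : two-sided false positive rate per gene
    power       : probability of detecting a true change of this size
    design      : "one-sample", "paired", or "two-sample"
    Returns replicates in total for one-sample/paired designs and
    replicates per group for the two-sample design.
    """
    if design not in ("one-sample", "paired", "two-sample"):
        raise ValueError("design must be one-sample, paired, or two-sample")
    delta = abs(math.log2(fold_change))       # effect size on the log2 scale
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    n = ((z_alpha + z_power) * sd_log2 / delta) ** 2
    if design == "two-sample":
        n *= 2                                # variance of a difference of two means
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical standard deviations standing in for a low-variability
    # (inbred-like) and a high-variability (outbred-like) population.
    for label, sd in (("low variability", 0.5), ("high variability", 1.0)):
        for fold in (1.5, 2, 4):
            n = samples_needed(fold, sd)
            print(f"{label}, {fold}-fold: {n} per group")
```

Because the required n scales with the square of the standard deviation, doubling the population variability roughly quadruples the number of replicates, which is the mechanism behind the inbred vs. outbred difference the abstract reports.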
first_indexed 2024-12-19T02:29:33Z
format Article
id doaj.art-21dbec9516f445aaba962ac7f100f4a7
institution Directory Open Access Journal
issn 1471-2164
language English
last_indexed 2024-12-19T02:29:33Z
publishDate 2004-11-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-21dbec9516f445aaba962ac7f100f4a7 | 2022-12-21T20:39:41Z | eng | BMC | BMC Genomics | 1471-2164 | 2004-11-01 | 5 | 1 | 87 | 10.1186/1471-2164-5-87 | Sample size for detecting differentially expressed genes in microarray experiments | Li Jiangning; Wei Caimiao; Bumgarner Roger E | http://www.biomedcentral.com/1471-2164/5/87
title Sample size for detecting differentially expressed genes in microarray experiments
url http://www.biomedcentral.com/1471-2164/5/87