Relative power and sample size analysis on gene expression profiling data

Abstract. Background: With the increasing number of expression profiling technologies, researchers today are confronted with choosing the technology that has sufficient power with minimal sample size, in order to reduce cost and time. These depend on data...

Bibliographic Details
Main Authors: den Dunnen JT, Hooiveld GJEJ, Pedotti P, 't Hoen PAC, van Iterson M, van Ommen GJB, Boer JM, Menezes RX
Format: Article
Language: English
Published: BMC 2009-09-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/10/439
author den Dunnen JT
Hooiveld GJEJ
Pedotti P
't Hoen PAC
van Iterson M
van Ommen GJB
Boer JM
Menezes RX
collection DOAJ
description Background: With the increasing number of expression profiling technologies, researchers must choose the technology that provides sufficient power at minimal sample size, in order to reduce cost and time. Both power and sample size depend on data variability, which is partly determined by sample type, preparation and processing. Objective measures that support experimental design, given a researcher's own pilot data, are thus fundamental.
Results: Relative power and sample size analyses were performed on two distinct data sets. The first set consisted of Affymetrix array data derived from a nutrigenomics experiment in which weak, intermediate and strong PPARα agonists were administered to wild-type and PPARα-null mice. Our analysis confirms the previously reported hierarchy of PPARα-activating compounds and the general idea that larger effect sizes increase the average power of the experiment. A simulation experiment was performed that mimicked the effect sizes seen in the first data set. The relative power was predicted, but the estimates were slightly conservative. The second, more challenging, data set describes a microarray platform comparison study using hippocampal δC-doublecortin-like kinase transgenic mice that were compared to wild-type mice; it was combined with results from Solexa/Illumina deep sequencing runs. As expected, the choice of technology greatly influences the performance of the experiment. Solexa/Illumina deep sequencing has the highest overall power, followed by the microarray platforms Agilent and Affymetrix. Interestingly, Solexa/Illumina deep sequencing displays comparable power across all intensity ranges, in contrast with microarray platforms, which have decreased power in the low-intensity range due to background noise. Deep sequencing is therefore notably more powerful than microarray platforms at detecting differences in the low-intensity range.
Conclusion: Power and sample size analysis based on pilot data gives valuable information on the performance of the experiment and can thereby guide further decisions on experimental design. Solexa/Illumina deep sequencing is the technology of choice if interest lies in genes expressed in the low-intensity range. Researchers can apply our approach to their own pilot data using the Bioconductor package SSPA (http://bioconductor.org/packages/release/bioc/html/SSPA.html).
first_indexed 2024-04-14T04:10:45Z
format Article
id doaj.art-7d9de50073d54c9c8538095e7b7f4a72
institution Directory Open Access Journal
issn 1471-2164
language English
last_indexed 2024-04-14T04:10:45Z
publishDate 2009-09-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-7d9de50073d54c9c8538095e7b7f4a72 2022-12-22T02:13:10Z eng BMC BMC Genomics 1471-2164 2009-09-01 10 1 439 10.1186/1471-2164-10-439
title Relative power and sample size analysis on gene expression profiling data
url http://www.biomedcentral.com/1471-2164/10/439