Summary: | When analyzing data from exome or whole-genome sequencing studies, window-based tests, i.e. tests that jointly analyze all genetic data in a small genomic region, are very popular. However, these tests are known to have quite low power for finding associations with phenotypes, and hence a variety of analytic strategies may be employed to try to improve power. Using sequencing data from all of chromosome 3 in an interim release of data on 2,432 individuals from the UK10K project, we simulated phenotypes associated with rare genetic variation, and used the results to explore the power of window-based tests and to ask two specific questions. First, we asked whether there could be substantial benefits from incorporating external annotation information on the genetic variants; second, we asked whether false discovery rates (FDRs) would be a useful metric for assessing significance. Although, as expected, there are benefits to using additional information (such as annotation) when it is associated with causality, we confirmed the general pattern of low sensitivity and power for window-based tests. At least for our chosen example, even when power is high to detect some associations, many of the regions containing causal variants cannot be detected, despite the use of lax significance thresholds and optimal analytic methods. Furthermore, our estimated FDR values tended to be much smaller than the true FDRs. Long-range correlations between variants, due to linkage disequilibrium, likely explain some of this bias. A more sophisticated approach to using the annotation information may improve power, but many causal variants of realistic effect sizes may simply be undetectable, at least with this sample size. Perhaps annotation information could assist in distinguishing windows containing causal variants from windows that are merely correlated with causal variants.
|