Impact of adaptive filtering on power and false discovery rate in RNA-seq experiments


Bibliographic Details
Main Authors: Sonja Zehetmayer, Martin Posch, Alexandra Graf
Format: Article
Language: English
Published: BMC 2022-09-01
Series: BMC Bioinformatics
Subjects: Next generation sequencing, Gene expression, Multiple testing, Gene filter
Online Access: https://doi.org/10.1186/s12859-022-04928-z
collection DOAJ
description Abstract
Background: In RNA-sequencing studies a large number of hypothesis tests are performed to compare the differential expression of genes between several conditions. Filtering has been proposed to remove candidate genes with a low expression level, which may not be relevant and have little or no chance of showing a difference between conditions. This step may reduce the multiple testing burden and increase power.
Results: We show in a simulation study that filtering can lead to some increase in power for RNA-sequencing data; too aggressive filtering, however, can lead to a decline. No uniformly optimal filter in terms of power exists: depending on the scenario, different filters may be optimal. We propose an adaptive filtering strategy which selects one of several filters to maximise the number of rejections. No additional adjustment for multiplicity has to be included, but a rule has to be considered if the number of rejections is too small.
Conclusions: For a large range of simulation scenarios, the adaptive filter maximises the power while the simulated False Discovery Rate is bounded by the pre-defined significance level. Using the adaptive filter, it is not necessary to pre-specify a single individual filtering method optimised for a specific scenario.
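The adaptive strategy summarised in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each candidate filter is a threshold on a gene's mean count, that the multiple testing procedure is Benjamini-Hochberg, and that the rule for "too few rejections" is a simple fallback to the first threshold in the list. The abstract specifies none of these details, and the names bh_reject, adaptive_filter and min_rejections are hypothetical.

```python
from typing import List, Sequence, Tuple


def bh_reject(pvals: Sequence[float], alpha: float) -> List[bool]:
    """Benjamini-Hochberg step-up procedure: one reject flag per p-value."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * alpha / m:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


def adaptive_filter(
    pvals: Sequence[float],
    mean_counts: Sequence[float],
    thresholds: Sequence[float],
    alpha: float = 0.05,
    min_rejections: int = 1,
) -> Tuple[float, List[int]]:
    """Apply each candidate filter, then keep the one with the most rejections.

    Returns the chosen threshold and the indices of the rejected genes.
    """
    results = {}
    for t in thresholds:
        # Assumed filter: keep genes whose mean count is at least t.
        kept = [i for i, c in enumerate(mean_counts) if c >= t]
        flags = bh_reject([pvals[i] for i in kept], alpha) if kept else []
        results[t] = [i for i, f in zip(kept, flags) if f]
    # max() returns the first maximal element, so ties go to the earlier threshold.
    best = max(thresholds, key=lambda t: len(results[t]))
    # Hypothetical fallback rule (an assumption, not taken from the paper):
    # with too few rejections, use the first threshold in the list.
    if len(results[best]) < min_rejections:
        best = thresholds[0]
    return best, results[best]


pvals = [0.001, 0.02, 0.03, 0.5, 0.8, 0.9, 0.7, 0.6]
mean_counts = [50, 40, 30, 1, 2, 0, 1, 3]
chosen, rejected = adaptive_filter(pvals, mean_counts, thresholds=[0, 5])
print(chosen, rejected)
```

In this toy data set the unfiltered analysis (threshold 0) rejects only one hypothesis, while removing the five low-count genes lowers the multiplicity burden enough to reject three, so the sketch selects threshold 5.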
first_indexed 2024-04-12T04:28:28Z
format Article
id doaj.art-fb014996a90a4fc588823b2cd38681bb
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-12T04:28:28Z
publishDate 2022-09-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-fb014996a90a4fc588823b2cd38681bb, 2022-12-22T03:48:00Z, eng
BMC, BMC Bioinformatics, ISSN 1471-2105, 2022-09-01, vol. 23, no. 1, pp. 1-16, https://doi.org/10.1186/s12859-022-04928-z
Sonja Zehetmayer, Martin Posch, Alexandra Graf (all: Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna)
topic Next generation sequencing
Gene expression
Multiple testing
Gene filter
url https://doi.org/10.1186/s12859-022-04928-z