Down-weighting overlapping genes improves gene set analysis

Abstract. Background: The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how spec...

Bibliographic Details
Main Authors: Tarca Adi, Draghici Sorin, Bhatti Gaurav, Romero Roberto
Format: Article
Language: English
Published: BMC 2012-06-01
Series: BMC Bioinformatics
Subjects: Gene expression; Gene set analysis; Pathway analysis; Overlapping gene sets
Online Access: http://www.biomedcentral.com/1471-2105/13/136
author Tarca Adi
Draghici Sorin
Bhatti Gaurav
Romero Roberto
collection DOAJ
description Abstract. Background: The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how specific they are to a given gene set. Results: In this work we propose a new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores. The gene weights are designed to emphasize the genes appearing in few gene sets over genes that appear in many gene sets. We demonstrate the usefulness of the method when analyzing gene sets that correspond to the KEGG pathways, and hence we called our method Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). Unlike most gene set analysis methods, which are validated through the analysis of 2-3 data sets followed by a human interpretation of the results, the validation employed here uses 24 different data sets and a completely objective assessment scheme that makes minimal assumptions and eliminates the need for possibly biased human assessments of the analysis results. Conclusions: PADOG significantly improves gene set ranking and boosts sensitivity of analysis using information already available in the gene expression profiles and the collection of gene sets to be analyzed. The advantages of PADOG over other existing approaches are shown to be stable to changes in the database of gene sets to be analyzed. PADOG was implemented as an R package available at http://bioinformaticsprb.med.wayne.edu/PADOG/ or http://www.bioconductor.org.
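The scoring idea in the abstract (a gene set score taken as the mean of absolute weighted moderated t-scores, with larger weights for genes that occur in few gene sets) can be illustrated with a minimal Python sketch. This is not the PADOG R package. The weight formula below, which maps gene frequency onto the range [1, 2], is an assumption chosen for illustration because the abstract does not state the formula; the gene names, gene sets, and t-scores are made-up toy values; and any permutation-based significance assessment is omitted.

```python
from collections import Counter
from math import sqrt


def gene_weights(gene_sets):
    """Give higher weight to genes that appear in fewer gene sets.

    Assumed form (not taken from the record): 1 + sqrt((max_f - f) / (max_f - min_f)),
    so the most widely shared gene gets weight 1 and the rarest gets weight 2.
    """
    freq = Counter(g for genes in gene_sets.values() for g in set(genes))
    lo, hi = min(freq.values()), max(freq.values())
    if hi == lo:  # every gene is equally frequent: no down-weighting possible
        return {g: 1.0 for g in freq}
    return {g: 1.0 + sqrt((hi - f) / (hi - lo)) for g, f in freq.items()}


def gene_set_score(genes, t_scores, weights):
    """Mean of absolute values of weighted t-scores for one gene set."""
    vals = [abs(weights[g] * t_scores[g]) for g in genes if g in t_scores]
    return sum(vals) / len(vals) if vals else 0.0


# Toy inputs (hypothetical): g1 is shared by all three sets, the rest are unique.
gene_sets = {
    "set_A": ["g1", "g2", "g3"],
    "set_B": ["g1", "g4"],
    "set_C": ["g1", "g5"],
}
# Stand-ins for moderated t-scores, which would normally come from a
# differential expression analysis.
t_scores = {"g1": 3.0, "g2": -2.0, "g3": 1.0, "g4": 0.5, "g5": -0.5}

weights = gene_weights(gene_sets)
scores = {name: gene_set_score(genes, t_scores, weights)
          for name, genes in gene_sets.items()}
```

In this toy input the shared gene g1 keeps weight 1 while the set-specific genes get weight 2, so a set's score is driven more by the genes that are specific to it than by the gene it shares with every other set.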
first_indexed 2024-12-10T12:48:03Z
format Article
id doaj.art-711f577868b840f7bd22f14624067a33
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-10T12:48:03Z
publishDate 2012-06-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-711f577868b840f7bd22f14624067a33; 2022-12-22T01:48:21Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2012-06-01; volume 13, issue 1, article 136; doi:10.1186/1471-2105-13-136
title Down-weighting overlapping genes improves gene set analysis
topic Gene expression
Gene set analysis
Pathway analysis
Overlapping gene sets
url http://www.biomedcentral.com/1471-2105/13/136