Co-expressed Pathways DataBase for Tomato: a database to predict pathways relevant to a query gene

Bibliographic Details
Main Authors: Takafumi Narise, Nozomu Sakurai, Takeshi Obayashi, Hiroyuki Ohta, Daisuke Shibata
Format: Article
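Language: English
Published: BMC 2017-06-01
Series: BMC Genomics
Subjects: Co-expression database; Pathway; Over-representation analysis; Gene set enrichment analysis; Percentile-score
Online Access: http://link.springer.com/article/10.1186/s12864-017-3786-3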
collection DOAJ
description Abstract
Background: Gene co-expression, the similarity of gene expression profiles under various experimental conditions, has been used as an indicator of functional relationships between genes, and many co-expression databases have been developed for predicting gene functions. These databases usually provide users with a co-expression network and a list of strongly co-expressed genes for a query gene. Several of these databases also provide functional information on a set of strongly co-expressed genes (i.e., the biological processes and pathways that are enriched in these genes), which is generally obtained via over-representation analysis (ORA). A limitation of this approach is that users can predict gene functions based only on the strongly co-expressed genes.
Results: In this study, we developed a new co-expression database that enables users to predict the function of tomato genes from the results of functional enrichment analyses of co-expressed genes while also considering genes that are not strongly co-expressed. To achieve this, we used the ORA approach with several thresholds for selecting co-expressed genes, and performed gene set enrichment analysis (GSEA) on a list of genes ranked by degree of co-expression. We found that internal correlation in pathways affected the significance levels of the enrichment analyses. Therefore, we introduced a new measure for evaluating the relationship between a gene and a pathway, termed the percentile (p)-score, which enables users to predict functionally relevant pathways without being affected by the internal correlation in pathways. In addition, we evaluated our approaches using receiver operating characteristic curves, which indicated that the p-score could improve the performance of the ORA.
Conclusions: We developed a new database, named Co-expressed Pathways DataBase for Tomato, which is available at http://cox-path-db.kazusa.or.jp/tomato. The database allows users to predict pathways that are relevant to a query gene, which would help to infer gene functions.
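The ORA step with several selection thresholds, as summarized in the abstract, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each threshold is a "top n most co-expressed genes" cutoff and that enrichment is scored with a one-sided hypergeometric test. The function names, the toy gene identifiers, and the threshold values are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K are in the pathway."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def ora_at_thresholds(ranked_genes, pathway, thresholds):
    """Run ORA on the top-n co-expressed genes for each n in thresholds.

    ranked_genes: genes ordered from most to least co-expressed with the query
                  (assumption: the query gene itself is excluded).
    pathway:      set of genes annotated to one pathway.
    Returns a dict {n: p-value}.
    """
    N = len(ranked_genes)
    K = len(pathway & set(ranked_genes))
    results = {}
    for n in thresholds:
        top = set(ranked_genes[:n])
        k = len(top & pathway)
        results[n] = hypergeom_sf(k, N, K, n)
    return results


# Toy data (hypothetical): 20 ranked genes, a 4-gene pathway.
ranked_genes = [f"g{i}" for i in range(20)]
pathway = {"g0", "g1", "g2", "g15"}
results = ora_at_thresholds(ranked_genes, pathway, thresholds=[5, 10])
for n, p in results.items():
    print(f"top {n}: p = {p:.4f}")
```

In this toy case the pathway looks more enriched at the stricter cutoff than at the looser one, which shows why reporting results at several thresholds gives a fuller picture than a single list of strongly co-expressed genes.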
first_indexed 2024-12-14T15:31:41Z
id doaj.art-812a0ca872534464b599f03c85db9d4d
institution Directory Open Access Journal
issn 1471-2164
last_indexed 2024-12-14T15:31:41Z
affiliation Takafumi Narise: Kazusa DNA Research Institute
Nozomu Sakurai: Kazusa DNA Research Institute
Takeshi Obayashi: Graduate School of Information Sciences, Tohoku University
Hiroyuki Ohta: Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology
Daisuke Shibata: Kazusa DNA Research Institute
volume 18
doi 10.1186/s12864-017-3786-3
topic Co-expression database
Pathway
Over-representation analysis
Gene set enrichment analysis
Percentile-score