Finding transcription factor binding motifs for coregulated genes by combining sequence overrepresentation with cross-species conservation

Novel computational methods for finding transcription factor binding motifs have long been sought due to the tedious work of identifying them experimentally. However, the current prevailing methods yield a large number of false positive predictions due to the short, variable nature of transcriptional fa...

Bibliographic Details
Main Authors: Jia, Hui., Li, Jinming.
Other Authors: School of Biological Sciences
Format: Journal Article
Language: English
Published: 2013
Subjects: DRNTU::Science::Biological sciences
Online Access: https://hdl.handle.net/10356/99457
http://hdl.handle.net/10220/17264
_version_ 1824453642772545536
author Jia, Hui.
Li, Jinming.
author2 School of Biological Sciences
author_facet School of Biological Sciences
Jia, Hui.
Li, Jinming.
author_sort Jia, Hui.
collection NTU
description Novel computational methods for finding transcription factor binding motifs have long been sought due to the tedious work of identifying them experimentally. However, the current prevailing methods yield a large number of false positive predictions because transcription factor binding sites (TFBSs) are short and variable. We propose here a method that combines sequence overrepresentation and cross-species sequence conservation to detect TFBSs in the upstream regions of a given set of coregulated genes. We applied the method to 35 S. cerevisiae transcription factors with known DNA binding motifs (with the support of orthologous sequences from the genomes of S. mikatae, S. bayanus, and S. paradoxus), and the proposed method outperformed the single-genome-based motif finding methods MEME and AlignACE as well as the multiple-genome-based methods PHYME and Footprinter for the majority of these transcription factors. Compared with the prevailing motif finding software, our method has some advantages in finding transcription factor binding motifs for potentially coregulated genes when the upstream sequences of those genes are available for multiple closely related species. Although we used yeast genomes to assess our method in this study, it might also be applied to other organisms if suitable related species are available and the upstream sequences of the coregulated genes can be obtained for them.
first_indexed 2025-02-19T03:09:40Z
format Journal Article
id ntu-10356/99457
institution Nanyang Technological University
language English
last_indexed 2025-02-19T03:09:40Z
publishDate 2013
record_format dspace
spelling ntu-10356/99457 2023-02-28T17:04:32Z Finding transcription factor binding motifs for coregulated genes by combining sequence overrepresentation with cross-species conservation Jia, Hui. Li, Jinming. School of Biological Sciences DRNTU::Science::Biological sciences Published version 2013-11-05T05:36:03Z 2019-12-06T20:07:42Z 2013-11-05T05:36:03Z 2019-12-06T20:07:42Z 2012 2012 Journal Article Jia, H., & Li, J. (2012). Finding Transcription Factor Binding Motifs for Coregulated Genes by Combining Sequence Overrepresentation with Cross-Species Conservation. Journal of Probability and Statistics, 2012, 1-18. https://hdl.handle.net/10356/99457 http://hdl.handle.net/10220/17264 10.1155/2012/830575 en Journal of probability and statistics © 2012 The Authors. This paper was published in Journal of Probability and Statistics and is made available as an electronic reprint (preprint) with permission of the authors. The paper can be found at the following official DOI: [http://dx.doi.org/10.1155/2012/830575]. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law. application/pdf
spellingShingle DRNTU::Science::Biological sciences
Jia, Hui.
Li, Jinming.
Finding transcription factor binding motifs for coregulated genes by combining sequence overrepresentation with cross-species conservation
title Finding transcription factor binding motifs for coregulated genes by combining sequence overrepresentation with cross-species conservation
title_full Finding transcription factor binding motifs for coregulated genes by combining sequence overrepresentation with cross-species conservation
title_fullStr Finding transcription factor binding motifs for coregulated genes by combining sequence overrepresentation with cross-species conservation
title_full_unstemmed Finding transcription factor binding motifs for coregulated genes by combining sequence overrepresentation with cross-species conservation
title_short Finding transcription factor binding motifs for coregulated genes by combining sequence overrepresentation with cross-species conservation
title_sort finding transcription factor binding motifs for coregulated genes by combining sequence overrepresentation with cross species conservation
topic DRNTU::Science::Biological sciences
url https://hdl.handle.net/10356/99457
http://hdl.handle.net/10220/17264
work_keys_str_mv AT jiahui findingtranscriptionfactorbindingmotifsforcoregulatedgenesbycombiningsequenceoverrepresentationwithcrossspeciesconservation
AT lijinming findingtranscriptionfactorbindingmotifsforcoregulatedgenesbycombiningsequenceoverrepresentationwithcrossspeciesconservation