An integrated approach to identifying cis-regulatory modules in the human genome.

In eukaryotic genomes, it is challenging to accurately determine target sites of transcription factors (TFs) by only using sequence information. Previous efforts were made to tackle this task by considering the fact that TF binding sites tend to be more conserved than other functional sites and the...

Bibliographic Details
Main Authors: Kyoung-Jae Won, Saurabh Agarwal, Li Shen, Robert Shoemaker, Bing Ren, Wei Wang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2009-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC2677454?pdf=render
collection DOAJ
description In eukaryotic genomes, it is challenging to determine the target sites of transcription factors (TFs) accurately from sequence information alone. Previous efforts tackled this task by exploiting the observations that TF binding sites tend to be more conserved than other functional sites and that the binding sites of several TFs are often clustered. Recently, data from ChIP-chip and ChIP-sequencing experiments have accumulated, both identifying TF binding sites and surveying chromatin modification patterns at regulatory elements such as promoters and enhancers. We propose here a hidden Markov model (HMM) that incorporates sequence motif information, TF-DNA interaction data, and chromatin modification patterns to precisely identify cis-regulatory modules (CRMs). We conducted ChIP-chip experiments on four TFs (CREB, E2F1, MAX, and YY1) in 1% of the human genome. We then trained the HMM to label CRMs by incorporating the sequence motifs recognized by these TFs and the ChIP-chip ratio. Chromatin modification data were used to predict functional sites and to further remove false positives. Cross-validation showed that our integrated HMM outperformed existing methods at predicting CRMs. Incorporating histone signature information successfully penalized false predictions and improved overall performance. The dataset and the software are available at http://nash.ucsd.edu/CIS/.
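The labeling step described above can be illustrated with a toy example. The following is a minimal sketch, not the authors' model: it assumes a two-state HMM (background vs. CRM), treats each probe as a pair of values (a log ChIP-chip ratio and a motif score), and models both with independent per-state Gaussians. All state names, parameters, and function names are hypothetical and chosen only for illustration; chromatin modification data are left out.

```python
import math

STATES = ("background", "CRM")

# Hypothetical parameters for illustration only (not taken from the paper).
LOG_START = {"background": math.log(0.9), "CRM": math.log(0.1)}
LOG_TRANS = {
    "background": {"background": math.log(0.95), "CRM": math.log(0.05)},
    "CRM": {"background": math.log(0.10), "CRM": math.log(0.90)},
}
# Per-state Gaussian (mean, standard deviation) for each observed feature.
EMISSION = {
    "background": {"chip": (0.0, 0.5), "motif": (0.0, 1.0)},
    "CRM": {"chip": (1.5, 0.5), "motif": (2.0, 1.0)},
}


def log_gauss(x, mean, std):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def log_emit(state, obs):
    """Log probability of one (chip_ratio, motif_score) pair in a state,
    assuming the two features are independent given the state."""
    chip, motif = obs
    params = EMISSION[state]
    return log_gauss(chip, *params["chip"]) + log_gauss(motif, *params["motif"])


def viterbi(observations):
    """Return the most likely state sequence for the observations."""
    if not observations:
        return []
    score = {s: LOG_START[s] + log_emit(s, observations[0]) for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        new_score, pointer = {}, {}
        for s in STATES:
            # Best predecessor state for landing in state s at this position.
            prev = max(STATES, key=lambda r: score[r] + LOG_TRANS[r][s])
            pointer[s] = prev
            new_score[s] = score[prev] + LOG_TRANS[prev][s] + log_emit(s, obs)
        score = new_score
        backpointers.append(pointer)
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: score[s])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path
```

Calling `viterbi` on a list of `(chip_ratio, motif_score)` pairs returns one label per probe. In this sketch the transition probabilities make isolated high-scoring probes less likely to be called as CRMs than runs of them, which is the general motivation for using an HMM over per-probe thresholds.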
first_indexed 2024-04-14T07:46:24Z
id doaj.art-b6a527048043436ba4e25cba2bbbb5b2
institution Directory Open Access Journal
issn 1932-6203
last_indexed 2024-04-14T07:46:24Z
citation PLoS ONE 4(5): e5501 (2009-01-01)
doi 10.1371/journal.pone.0005501