An Integrated Model of Multiple-Condition ChIP-Seq Data Reveals Predeterminants of Cdx2 Binding
Regulatory proteins can bind to different sets of genomic targets in various cell types or conditions. To reliably characterize such condition-specific regulatory binding we introduce MultiGPS, an integrated machine learning approach for the analysis of multiple related ChIP-seq experiments. MultiGP...
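Main Authors: | Mahony, Shaun; Edwards, Matthew Douglas; Mazzoni, Esteban O.; Sherwood, Richard I.; Kakumanu, Akshay; Morrison, Carolyn A.; Wichterle, Hynek; Gifford, David K. |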
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Language: | en_US |
Published: | Public Library of Science 2014 |
Online Access: | http://hdl.handle.net/1721.1/86086 https://orcid.org/0000-0002-5845-748X https://orcid.org/0000-0003-1709-4034 |
_version_ | 1811071964987523072 |
---|---|
author | Mahony, Shaun; Edwards, Matthew Douglas; Mazzoni, Esteban O.; Sherwood, Richard I.; Kakumanu, Akshay; Morrison, Carolyn A.; Wichterle, Hynek; Gifford, David K. |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_facet | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Mahony, Shaun; Edwards, Matthew Douglas; Mazzoni, Esteban O.; Sherwood, Richard I.; Kakumanu, Akshay; Morrison, Carolyn A.; Wichterle, Hynek; Gifford, David K. |
author_sort | Mahony, Shaun |
collection | MIT |
description | Regulatory proteins can bind to different sets of genomic targets in various cell types or conditions. To reliably characterize such condition-specific regulatory binding we introduce MultiGPS, an integrated machine learning approach for the analysis of multiple related ChIP-seq experiments. MultiGPS is based on a generalized Expectation Maximization framework that shares information across multiple experiments for binding event discovery. We demonstrate that our framework enables the simultaneous modeling of sparse condition-specific binding changes, sequence dependence, and replicate-specific noise sources. MultiGPS encourages consistency in reported binding event locations across multiple-condition ChIP-seq datasets and provides accurate estimation of ChIP enrichment levels at each event. MultiGPS's multi-experiment modeling approach thus provides a reliable platform for detecting differential binding enrichment across experimental conditions. We demonstrate the advantages of MultiGPS with an analysis of Cdx2 binding in three distinct developmental contexts. By accurately characterizing condition-specific Cdx2 binding, MultiGPS enables novel insight into the mechanistic basis of Cdx2 site selectivity. Specifically, the condition-specific Cdx2 sites characterized by MultiGPS are highly associated with pre-existing genomic context, suggesting that such sites are pre-determined by cell-specific regulatory architecture. However, MultiGPS-defined condition-independent sites are not predicted by pre-existing regulatory signals, suggesting that Cdx2 can bind to a subset of locations regardless of genomic environment. A summary of this paper appears in the proceedings of the RECOMB 2014 conference, April 2–5. |
first_indexed | 2024-09-23T08:58:45Z |
format | Article |
id | mit-1721.1/86086 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T08:58:45Z |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | dspace |
spelling | mit-1721.1/86086; 2022-09-30T12:34:04Z; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science; Edwards, Matthew Douglas; Gifford, David K.; National Science Foundation (U.S.) (Graduate Research Fellowship under Grant 0645960); National Institutes of Health (U.S.) (grant P01 NS055923); Pennsylvania State University. Center for Eukaryotic Gene Regulation; 2014-04-09T19:50:58Z; 2014-04-09T19:50:58Z; 2014-03; 2013-10; Article; http://purl.org/eprint/type/JournalArticle; 1553-7358; http://hdl.handle.net/1721.1/86086; Mahony, Shaun, Matthew D. Edwards, Esteban O. Mazzoni, Richard I. Sherwood, Akshay Kakumanu, Carolyn A. Morrison, Hynek Wichterle, and David K. Gifford. “An Integrated Model of Multiple-Condition ChIP-Seq Data Reveals Predeterminants of Cdx2 Binding.” Edited by Ilya Ioshikhes. PLoS Comput Biol 10, no. 3 (March 27, 2014): e1003501; https://orcid.org/0000-0002-5845-748X; https://orcid.org/0000-0003-1709-4034; en_US; http://dx.doi.org/10.1371/journal.pcbi.1003501; PLoS Computational Biology; Creative Commons Attribution; http://creativecommons.org/licenses/by/4.0/; application/pdf; Public Library of Science; PLoS |
title | An Integrated Model of Multiple-Condition ChIP-Seq Data Reveals Predeterminants of Cdx2 Binding |
url | http://hdl.handle.net/1721.1/86086 https://orcid.org/0000-0002-5845-748X https://orcid.org/0000-0003-1709-4034 |