Predicting transcription factor specificity with all-atom models

Bibliographic Details
Main Authors: Virnau, Peter; Rahi, Sahand Jamal; Mirny, Leonid A.; Kardar, Mehran
Other Authors: Harvard University--MIT Division of Health Sciences and Technology
Format: Article
Language: en_US
Published: Oxford University Press (OUP) 2012
Online Access: http://hdl.handle.net/1721.1/70954
https://orcid.org/0000-0002-0785-5410
https://orcid.org/0000-0002-1112-5912
Description
Summary: The binding of a transcription factor (TF) to a DNA operator site can initiate or repress the expression of a gene. Computational prediction of the sites recognized by a TF has traditionally relied on knowledge of several cognate sites rather than on an ab initio approach. Here, we examine the possibility of using structure-based energy calculations that require no knowledge of bound sites but instead start from the structure of a protein–DNA complex. We study the Escherichia coli TF PurR and explore the extent to which atomistic models of protein–DNA complexes can be used to distinguish between cognate and noncognate DNA sites. Particular emphasis is placed on systematic evaluation of this approach: we compare its performance with bioinformatic methods and test it against random decoys and sites of homologous TFs. We also examine a set of experimental mutations in both the DNA and the protein. Using our explicit estimates of energy, we show that the specificity of PurR is dominated by direct protein–DNA interactions and is only weakly influenced by bending of the DNA.