Ultra-fast evaluation of protein energies directly from sequence.

The structure, function, stability, and many other properties of a protein in a fixed environment are fully specified by its sequence, but in a manner that is difficult to discern. We present a general approach for rapidly mapping sequences directly to their energies on a pre-specified rigid backbon...


Bibliographic Details
Main Authors: Gevorg Grigoryan, Fei Zhou, Steve R Lustig, Gerbrand Ceder, Dane Morgan, Amy E Keating
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2006-06-01
Series: PLoS Computational Biology
Online Access: http://europepmc.org/articles/PMC1479088?pdf=render
Abstract: The structure, function, stability, and many other properties of a protein in a fixed environment are fully specified by its sequence, but in a manner that is difficult to discern. We present a general approach for rapidly mapping sequences directly to their energies on a pre-specified rigid backbone, an important sub-problem in computational protein design and in some methods for protein structure prediction. The cluster expansion (CE) method that we employ can, in principle, be extended to model any computable or measurable protein property directly as a function of sequence. Here we show how CE can be applied to the problem of computational protein design, and use it to derive excellent approximations of physical potentials. The approach provides several attractive advantages. First, following a one-time derivation of a cluster expansion, the time necessary to evaluate the energy of a sequence adopting a specified backbone conformation is reduced by a factor of 10^7 compared to standard full-atom methods for the same task. Second, the agreement between two full-atom methods that we tested and their CE sequence-based expressions is very high (root mean square deviation 1.1-4.7 kcal/mol, R^2 = 0.7-1.0). Third, the functional form of the CE energy expression is such that individual terms of the expansion have clear physical interpretations. We derived expressions for the energies of three classic protein design targets (a coiled coil, a zinc finger, and a WW domain) as functions of sequence, and examined the most significant terms. Single-residue and residue-pair interactions are sufficient to accurately capture the energetics of the dimeric coiled coil, whereas higher-order contributions are important for the two more globular folds. For the task of designing novel zinc-finger sequences, a CE-derived energy function provides significantly better solutions than a standard design protocol, in comparable computation time. Given these advantages, CE is likely to find many uses in computational structural modeling.
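The idea of expressing energy as a sum of single-residue and residue-pair terms can be illustrated with a small sketch. This is not the authors' implementation: the four-letter alphabet, the four-site "protein", the random lookup tables standing in for a full-atom energy calculation, and the names `reference_energy`, `features`, and `fit_and_score` are all assumptions made for illustration. The sketch fits expansion coefficients by ordinary least squares and compares a single-residue-only expansion with one that also includes pair terms.

```python
import itertools
import numpy as np

ALPHABET = "AVLK"   # hypothetical reduced alphabet, chosen only for brevity
N_SITES = 4         # hypothetical number of designable positions
PAIRS = list(itertools.combinations(range(N_SITES), 2))

rng = np.random.default_rng(0)
# Stand-in for an expensive structure-based energy: random single-site and
# pair tables (an assumption; a real study would call a full-atom method).
_single = rng.normal(size=(N_SITES, len(ALPHABET)))
_pair = rng.normal(size=(len(PAIRS), len(ALPHABET), len(ALPHABET)))


def reference_energy(seq):
    """Toy 'true' energy of a sequence on the fixed backbone."""
    idx = [ALPHABET.index(c) for c in seq]
    e = sum(_single[i, a] for i, a in enumerate(idx))
    e += sum(_pair[k, idx[i], idx[j]] for k, (i, j) in enumerate(PAIRS))
    return float(e)


def features(seq, use_pairs=True):
    """Indicator vector: constant, (site, residue) and optional pair terms."""
    n = len(ALPHABET)
    idx = [ALPHABET.index(c) for c in seq]
    point = np.zeros((N_SITES, n))
    point[np.arange(N_SITES), idx] = 1.0
    parts = [np.ones(1), point.ravel()]
    if use_pairs:
        pair = np.zeros((len(PAIRS), n, n))
        for k, (i, j) in enumerate(PAIRS):
            pair[k, idx[i], idx[j]] = 1.0
        parts.append(pair.ravel())
    return np.concatenate(parts)


def fit_and_score(train, test, use_pairs):
    """Fit coefficients on `train`; return RMSD on held-out `test`."""
    X = np.array([features(s, use_pairs) for s in train])
    y = np.array([reference_energy(s) for s in train])
    # Minimum-norm least squares copes with the redundant indicator columns.
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    pred = np.array([features(s, use_pairs) @ coeffs for s in test])
    true = np.array([reference_energy(s) for s in test])
    return float(np.sqrt(np.mean((pred - true) ** 2)))


all_seqs = ["".join(p) for p in itertools.product(ALPHABET, repeat=N_SITES)]
order = rng.permutation(len(all_seqs))
train = [all_seqs[i] for i in order[:200]]
test = [all_seqs[i] for i in order[200:]]

rmsd_point = fit_and_score(train, test, use_pairs=False)
rmsd_pair = fit_and_score(train, test, use_pairs=True)
print(f"held-out RMSD, single-residue terms only: {rmsd_point:.3f}")
print(f"held-out RMSD, single + pair terms:       {rmsd_pair:.3e}")
```

Once the coefficients are fitted, scoring a new sequence is a single dot product, which is the source of the speed-up described in the abstract. Because the toy energy contains only single and pair contributions, the pair-level expansion reproduces it essentially exactly; a real energy function may need higher-order terms, as the abstract notes for the more globular folds.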
ISSN: 1553-734X, 1553-7358
Source: Directory of Open Access Journals (DOAJ), record doaj.art-06b84c020132469b99bece16bd8fa07f
Citation: PLoS Computational Biology 2(6): e63 (2006)
DOI: 10.1371/journal.pcbi.0020063