Atomic Interaction Networks in the Core of Protein Domains and Their Native Folds

Vastly divergent sequences populate a majority of protein folds. In the quest to identify features that are conserved within protein domains belonging to the same fold, we set out to examine the entire protein universe on a fold-by-fold basis. We report that the atomic interaction network in the sol...

Bibliographic Details
Main Authors: Soundararajan, Venkataramanan; Raman, Rahul; Raguram, S; Sasisekharan, Viswanathan; Sasisekharan, Ram
Other Authors: Massachusetts Institute of Technology. Department of Biology
Format: Article
Published: Public Library of Science (PLoS) 2018
Online Access: http://hdl.handle.net/1721.1/116196
https://orcid.org/0000-0002-2085-7840
_version_ 1826204752413720576
author Soundararajan, Venkataramanan
Raman, Rahul
Raguram, S
Sasisekharan, Viswanathan
Sasisekharan, Ram
author2 Massachusetts Institute of Technology. Department of Biology
collection MIT
description Vastly divergent sequences populate a majority of protein folds. In the quest to identify features that are conserved within protein domains belonging to the same fold, we set out to examine the entire protein universe on a fold-by-fold basis. We report that the atomic interaction network in the solvent-unexposed core of protein domains is fold-conserved, extraordinary sequence divergence notwithstanding. Further, we find that this feature, termed the protein core atomic interaction network (or PCAIN), is significantly distinguishable across different folds, thus appearing to be a "signature" of a domain's native fold. As part of this study, we computed the PCAINs for 8698 representative protein domains from families across the 1018 known protein folds to construct our seed database, and an automated framework was developed for PCAIN-based characterization of the protein fold universe. A test set of randomly selected domains that are not in the seed database was classified with over 97% accuracy, independent of sequence divergence. As an application of this novel fold signature, a PCAIN-based scoring scheme was developed for comparative (homology-based) structure prediction, with 1-2 Å (mean 1.61 Å) Cα RMSD generally observed between computed structures and reference crystal structures. Our results are consistent across the full spectrum of test domains, including those from recent CASP experiments and, most notably, in the 'twilight' and 'midnight' zones, wherein <30% and <10% target-template sequence identity prevails, respectively (mean twilight RMSD of 1.69 Å). We further demonstrate the utility of the PCAIN protocol to derive biological insight into protein structure-function relationships by modeling the structure of the YopM effector novel E3 ligase (NEL) domain from the plague-causative bacterium Yersinia pestis and discussing its implications for host adaptive and innate immune modulation by the pathogen. Considering the several high-throughput, sequence-identity-independent applications demonstrated in this work, we suggest that the PCAIN is a fundamental fold feature that could be a valuable addition to the arsenal of protein modeling and analysis tools.
first_indexed 2024-09-23T13:00:51Z
format Article
id mit-1721.1/116196
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T13:00:51Z
publishDate 2018
publisher Public Library of Science (PLoS)
record_format dspace
spelling mit-1721.1/116196 2022-09-28T11:28:32Z
Koch Institute for Integrative Cancer Research at MIT
2018-06-11T14:40:57Z 2018-06-11T14:40:57Z 2010-02 2009-12 2018-06-08T17:45:49Z
Article http://purl.org/eprint/type/JournalArticle
1932-6203
http://hdl.handle.net/1721.1/116196
Soundararajan, Venkataramanan et al. “Atomic Interaction Networks in the Core of Protein Domains and Their Native Folds.” Edited by Neeraj Vij. PLoS ONE 5, 2 (February 2010): e9391 © 2010 Soundararajan et al
https://orcid.org/0000-0002-2085-7840
http://dx.doi.org/10.1371/journal.pone.0009391
PLoS ONE
Attribution 4.0 International (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
application/pdf
Public Library of Science (PLoS)
PLoS
title Atomic Interaction Networks in the Core of Protein Domains and Their Native Folds
url http://hdl.handle.net/1721.1/116196
https://orcid.org/0000-0002-2085-7840