Conformationally selective multidimensional chemical shift ranges in proteins from a PACSY database purged using intrinsic quality criteria

Bibliographic Details
Main Authors: Schmidt-Rohr, Klaus; Fritzsching, Keith J.; Hong, Mei
Other Authors: Massachusetts Institute of Technology. Department of Chemistry
Format: Article
Language: English
Published: Springer Netherlands 2016
Online Access: http://hdl.handle.net/1721.1/105515
ORCID: https://orcid.org/0000-0001-5255-5858
Description: We have determined refined multidimensional chemical shift ranges for intra-residue correlations (¹³C–¹³C, ¹⁵N–¹³C, etc.) in proteins, which can be used to gain type-assignment and/or secondary-structure information from experimental NMR spectra. The chemical-shift ranges are the result of a statistical analysis of the PACSY database of >3000 proteins with 3D structures (1,200,207 ¹³C chemical shifts and >3 million chemical shifts in total); these data were originally derived from the Biological Magnetic Resonance Data Bank. Using relatively simple non-parametric statistics to find peak maxima in the distributions of helix, sheet, coil and turn chemical shifts, and without the use of limited "hand-picked" data sets, we show that ~94% of the ¹³C NMR data and almost all ¹⁵N data are quite accurately referenced and assigned, with smaller standard deviations (0.2 and 0.8 ppm, respectively) than recognized previously. On the other hand, approximately 6% of the ¹³C chemical shift data in the PACSY database are shown to be clearly misreferenced, mostly by ca. −2.4 ppm. The removal of the misreferenced data and other outliers by this purging by intrinsic quality criteria (PIQC) allows for reliable identification of secondary maxima in the two-dimensional chemical-shift distributions already pre-separated by secondary structure. We demonstrate that some of these correspond to specific regions in the Ramachandran plot, including left-handed helix dihedral angles, reflect unusual hydrogen bonding, or are due to the influence of a following proline residue. With appropriate smoothing, significantly more tightly defined chemical shift ranges are obtained for each amino acid type in the different secondary structures. These chemical shift ranges, which may be defined at any statistical threshold, can be used for amino-acid type assignment and secondary-structure analysis of chemical shifts from intra-residue cross peaks by inspection or by using a provided command-line Python script (PLUQin), which should be useful in protein structure determination. The refined chemical shift distributions are utilized in a simple quality test (SQAT) that should be applied to new protein NMR data before deposition in a databank, and they could benefit many other chemical-shift based tools.
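The misreferencing described above (a roughly constant offset of the ¹³C shifts in a data set) can be illustrated with a small sketch. This is not the authors' PIQC procedure and not the PLUQin or SQAT code, neither of which is reproduced in this record. It only assumes that each observed shift can be paired with a reference value for the same atom type, estimates the offset as the median difference, and flags the data set when that offset exceeds a tolerance. The function names, the reference values, and the 1.0 ppm tolerance are all hypothetical and chosen purely for illustration.

```python
from statistics import median


def estimate_offset(observed, reference):
    """Return the median of (observed - reference) over paired shifts, in ppm.

    The median is a simple non-parametric estimate that a few outliers
    (for example, isolated misassignments) do not dominate.
    """
    diffs = [o - r for o, r in zip(observed, reference)]
    return median(diffs)


def looks_misreferenced(observed, reference, tolerance_ppm=1.0):
    """Flag a data set whose estimated offset exceeds the tolerance.

    tolerance_ppm is a hypothetical threshold, not a value from the paper.
    """
    return abs(estimate_offset(observed, reference)) > tolerance_ppm


# Hypothetical reference shifts (ppm); illustrative numbers, not from PACSY.
reference = [55.0, 30.0, 175.0, 60.0, 20.0]

# A data set that scatters closely around the reference values.
well_referenced = [55.2, 29.9, 175.1, 59.8, 20.0]

# The same reference values shifted by the offset quoted in the abstract.
shifted = [r - 2.4 for r in reference]

if __name__ == "__main__":
    print(looks_misreferenced(well_referenced, reference))
    print(estimate_offset(shifted, reference))
    print(looks_misreferenced(shifted, reference))
```

A median is used here only because it is a simple outlier-resistant estimator; the paper itself works with peak maxima of the chemical-shift distributions, which this sketch does not attempt to reproduce.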
Citation: Fritzsching, Keith J., Mei Hong, and Klaus Schmidt-Rohr. "Conformationally Selective Multidimensional Chemical Shift Ranges in Proteins from a PACSY Database Purged Using Intrinsic Quality Criteria." J Biomol NMR 64, no. 2 (January 19, 2016): 115–130.
Journal: Journal of Biomolecular NMR
ISSN: 0925-2738, 1573-5001
DOI: http://dx.doi.org/10.1007/s10858-016-0013-5
Type: http://purl.org/eprint/type/JournalArticle
File format: application/pdf
Funding: National Institutes of Health (U.S.) (Grant GM066976)
Rights: Creative Commons Attribution-Noncommercial-Share Alike, http://creativecommons.org/licenses/by-nc-sa/4.0/; Springer Science+Business Media Dordrecht
Dates (as listed in the record): 2016-12-01T22:53:38Z, 2016-01, 2015-10, 2016-08-18T15:18:51Z, 2022-09-26T10:19:34Z