Statistical analysis of the protein environment of N-glycosylation sites: implications for occupancy, structure, and folding.

Bibliographic Details
Main Authors: Petrescu, A, Milac, A, Petrescu, S, Dwek, R, Wormald, M
Format: Journal article
Language: English
Published: 2004
collection OXFORD
description We recently reported statistical analysis of structural data on glycosidic linkages. Here we extend this analysis to the glycan-protein linkage, and the peptide primary, secondary, and tertiary structures around N-glycosylation sites. We surveyed 506 glycoproteins in the Protein Data Bank crystallographic database, giving 2592 glycosylation sequons (1683 occupied) and generated a database of 626 nonredundant sequons with 386 occupied. Deviations in the expected amino acid composition were seen around occupied asparagines, particularly an increased occurrence of aromatic residues before the asparagine and threonine at position +2. Glycosylation alters the asparagine side chain torsion angle distribution and reduces its flexibility. There is an elevated probability of finding glycosylation sites in which secondary structure changes. An 11-class taxonomy was developed to describe protein surface geometry around glycosylation sites. Thirty-three percent of the occupied sites are on exposed convex surfaces, 10% in deep recesses and 20% on the edge of grooves with the glycan filling the cleft. A surprisingly large number of glycosylated asparagine residues have a low accessibility. The incidence of aromatic amino acids brought into close contact with the glycan by the folding process is higher than their normal levels on the surface or in the protein core. These data have significant implications for control of sequon occupancy and evolutionary selection of glycosylation sites and are discussed in relation to mechanisms of protein fold stabilization and regional quality control of protein folding. Hydrophobic protein-glycan interactions and the low accessibility of glycosylation sites in folded proteins are common features and may be critical in mediating these functions.
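The sequon counts quoted in the abstract can be illustrated with a minimal sketch. It assumes the canonical N-glycosylation sequon, Asn-X-Ser/Thr with X any residue except proline; the record itself does not spell out that definition. The function names and the toy sequence are hypothetical and are not taken from the authors' survey. The occupancy figures are computed directly from the counts stated above.

```python
import re

# Assumption: the canonical sequon Asn-X-Ser/Thr, where X is not proline.
# A lookahead is used so that overlapping sequons (e.g. "NNTS") are both found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


def occupancy(occupied, total):
    """Fraction of sequons that carry a glycan."""
    return occupied / total


# Hypothetical toy sequence, not one of the surveyed glycoproteins.
toy = "MKNGTAWNPSLNNTSQ"
print(find_sequons(toy))  # [3, 12, 13]; N8 is skipped because X is proline

# Counts taken from the abstract.
print(f"all sequons:          {occupancy(1683, 2592):.1%}")  # 64.9%
print(f"nonredundant sequons: {occupancy(386, 626):.1%}")    # 61.7%
```

On these counts, roughly two thirds of the sequons in the surveyed structures are occupied, and the proportion stays close to that after redundancy is removed.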
id oxford-uuid:98f998f8-c1cc-4cfc-b6fe-ee768290ed8d
institution University of Oxford