Sequence variation in ligand binding sites in proteins

Abstract. Background: The recent explosion in the availability of complete genome sequences has led to the cataloging of tens of thousands of new proteins and putative proteins. Many of these proteins can be structurally or functionally categorized from s...


Bibliographic Details
Main Authors: Magliery Thomas J, Regan Lynne
Format: Article
Language: English
Published: BMC 2005-09-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/6/240
_version_ 1818515648164134912
author Magliery Thomas J
Regan Lynne
collection DOAJ
description Abstract. Background: The recent explosion in the availability of complete genome sequences has led to the cataloging of tens of thousands of new proteins and putative proteins. Many of these proteins can be structurally or functionally categorized from sequence conservation alone. In contrast, little attention has been given to the meaning of poorly-conserved sites in families of proteins, which are typically assumed to be of little structural or functional importance. Results: Recently, using statistical free energy analysis of tetratricopeptide repeat (TPR) domains, we observed that positions in contact with peptide ligands are more variable than surface positions in general. Here we show that statistical analysis of TPRs, ankyrin repeats, Cys₂His₂ zinc fingers and PDZ domains accurately identifies specificity-determining positions by their sequence variation. Sequence variation is measured as deviation from a neutral reference state, and we present probabilistic and information theory formalisms that improve upon recently suggested methods such as statistical free energies and sequence entropies. Conclusion: Sequence variation has been used to identify functionally-important residues in four selected protein families. With TPRs and ankyrin repeats, protein families that bind highly diverse ligands, the effect is so pronounced that sequence "hypervariation" alone can be used to predict ligand binding sites.
first_indexed 2024-12-11T00:31:32Z
format Article
id doaj.art-817b7a094609442b82b7e34683453a7c
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-11T00:31:32Z
publishDate 2005-09-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-817b7a094609442b82b7e34683453a7c; 2022-12-22T01:27:20Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2005-09-01; 6; 1; 240; 10.1186/1471-2105-6-240; Sequence variation in ligand binding sites in proteins; Magliery Thomas J; Regan Lynne; http://www.biomedcentral.com/1471-2105/6/240
title Sequence variation in ligand binding sites in proteins
url http://www.biomedcentral.com/1471-2105/6/240