CLIPS-1D: analysis of multiple sequence alignments to deduce for residue-positions a role in catalysis, ligand-binding, or protein structure
Abstract. Background: One aim of the in silico characterization of proteins is to identify all residue-positions that are crucial for function or structure. Several sequence-based algorithms exist that predict functionally importa...
Main Authors: | Janda Jan-Oliver; Busch Markus; Kück Fabian; Porfenenko Mikhail; Merkl Rainer |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2012-04-01 |
Series: | BMC Bioinformatics |
Online Access: | http://www.biomedcentral.com/1471-2105/13/55 |
_version_ | 1818149889696071680 |
---|---|
author | Janda Jan-Oliver; Busch Markus; Kück Fabian; Porfenenko Mikhail; Merkl Rainer |
collection | DOAJ |
description | Abstract. Background: One aim of the in silico characterization of proteins is to identify all residue-positions that are crucial for function or structure. Several sequence-based algorithms exist that predict functionally important sites. However, on the basis of sequence information alone, many functionally and structurally important sites are hard to distinguish, and consequently a large number of incorrectly predicted functional sites must be expected. We therefore set out to design a new classifier that differentiates between functionally and structurally important sites and to assess its performance on representative datasets. Results: We have implemented CLIPS-1D, which predicts a role in catalysis, ligand-binding, or protein structure for residue-positions in a mutually exclusive manner. By analyzing a multiple sequence alignment, the algorithm scores conservation as well as abundance of residues at individual sites and in their local neighborhood, and categorizes each site by means of a multiclass support vector machine. A cross-validation confirmed that residue-positions involved in catalysis were identified with state-of-the-art quality; the mean MCC-value was 0.34. For structurally important sites, prediction quality was considerably higher (mean MCC = 0.67). For ligand-binding sites, prediction quality was lower (mean MCC = 0.12), because binding sites and structurally important residue-positions share conservation and abundance values, which makes their separation difficult. We show that classification success varies for residues in a class-specific manner. For this reason, the algorithm computes residue-specific p-values, which allow for the statistical assessment of each individual prediction. CLIPS-1D is available as a Web service at http://www-bioinf.uni-regensburg.de/. Conclusions: CLIPS-1D is a classifier whose prediction quality has been determined separately for catalytic sites, ligand-binding sites, and structurally important sites. It generates hypotheses about residue-positions important for a set of homologous proteins and focuses on conservation and abundance signals. Thus, the algorithm can be applied in cases where function cannot be transferred from well-characterized proteins by means of sequence comparison. |
first_indexed | 2024-12-11T13:14:13Z |
format | Article |
id | doaj.art-791058a053b744539f473dad7cf69a44 |
institution | Directory of Open Access Journals |
issn | 1471-2105 |
language | English |
last_indexed | 2024-12-11T13:14:13Z |
publishDate | 2012-04-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-791058a053b744539f473dad7cf69a44; 2022-12-22T01:06:07Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2012-04-01; 13; 1; 55; 10.1186/1471-2105-13-55; http://www.biomedcentral.com/1471-2105/13/55 |
title | CLIPS-1D: analysis of multiple sequence alignments to deduce for residue-positions a role in catalysis, ligand-binding, or protein structure |
url | http://www.biomedcentral.com/1471-2105/13/55 |
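The abstract reports prediction quality per class as a mean MCC-value (0.34 for catalytic sites, 0.67 for structurally important sites, 0.12 for ligand-binding sites). As a reminder of what such a value measures, the following is a minimal sketch of the Matthews correlation coefficient computed for one class against all others. It is not taken from CLIPS-1D: the one-vs-rest treatment, the function names `mcc` and `one_vs_rest_counts`, and the class labels and toy data are assumptions made for illustration, and the record does not state how the published per-class values were derived.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is set to 0 when a marginal sum is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


def one_vs_rest_counts(true_labels, predicted_labels, cls):
    """Count TP, TN, FP, FN for class `cls` against all other classes."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == cls and pred == cls:
            tp += 1
        elif truth != cls and pred != cls:
            tn += 1
        elif truth != cls and pred == cls:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


# Hypothetical toy labels (not data from the paper).
true_labels = ["CAT", "CAT", "LIG", "STRUC", "STRUC", "NONE"]
predicted_labels = ["CAT", "LIG", "LIG", "STRUC", "CAT", "NONE"]

cat_counts = one_vs_rest_counts(true_labels, predicted_labels, "CAT")
cat_mcc = mcc(*cat_counts)
```

An MCC of 1 corresponds to perfect agreement, 0 to a prediction no better than chance, and -1 to complete disagreement, which is why the value remains informative when a class such as catalytic sites covers only a small fraction of all residue-positions.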