Recognizing the fold of a protein structure.

This paper reports a graph-theoretic program, GRATH, that rapidly and accurately matches a novel structure against a library of domain structures to find the most similar ones. GRATH generates distributions of scores by comparing the novel domain against the different types of folds that have been...

Bibliographic details
Main authors: Harrison, A; Pearl, F; Sillitoe, I; Slidel, T; Mott, R; Thornton, J; Orengo, C
Format: Journal article
Language: English
Published: 2003
collection OXFORD
description This paper reports a graph-theoretic program, GRATH, that rapidly and accurately matches a novel structure against a library of domain structures to find the most similar ones. GRATH generates distributions of scores by comparing the novel domain against the different types of folds that have previously been classified in the CATH database of structural domains. GRATH uses a similarity measure that captures what any two protein structures share: geometric information, the number of secondary structures, and the number of residues within those secondary structures. Although GRATH builds on well-established approaches for secondary structure comparison, a novel scoring scheme has been introduced to allow ranking of any matches identified by the algorithm. More importantly, we have benchmarked the algorithm using a large dataset of 1702 non-redundant structures from the CATH database that have already been classified into fold groups, with manual validation. This has facilitated the introduction of further constraints, the optimization of parameters and the identification of reliable thresholds for fold identification. Following these benchmarking trials, the correct fold is identified as the top-scoring match with a frequency of 90%, and within the ten most likely assignments with a frequency of 98%. GRATH is available via a server (http://www.biochem.ucl.ac.uk/cgi-bin/cath/Grath.pl). GRATH's speed and accuracy mean that it can be used as a reliable front-end filter for the more accurate, but computationally expensive, residue-based structure comparison algorithm SSAP, currently used to classify domain structures in the CATH database. With an increasing number of structures being solved by the structural genomics initiatives, the GRATH server also provides an essential resource for determining whether newly determined structures are related to any known structures from which functional properties may be inferred.
id oxford-uuid:9c94e65a-ff12-4583-a3ff-d58db3eddedb
institution University of Oxford