Protein loop structure prediction

This dissertation concerns the study and prediction of loops in protein structures. Proteins perform crucial functions in living organisms. Despite their importance, we are currently unable to predict their three-dimensional structure accurately. Loops...


Bibliographic Details
Main Author: Choi, Y
Other Authors: Deane, C
Format: Thesis
Language: English
Published: 2011
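Subjects: Protein folding; Statistics (see also social sciences); Bioinformatics (life sciences); Bioinformatics (biochemistry)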
_version_ 1826294032757686272
author Choi, Y
author2 Deane, C
author_facet Deane, C
Choi, Y
author_sort Choi, Y
collection OXFORD
description <p>This dissertation concerns the study and prediction of loops in protein structures.</p><p>Proteins perform crucial functions in living organisms. Despite their importance, we are currently unable to predict their three-dimensional structure accurately.</p><p>Loops are segments that connect the regular secondary structures of proteins. They tend to be located on the surface of proteins and often interact with other biological agents. As loops are generally subject to more frequent mutations than the rest of the protein, their sequences and structural conformations can vary significantly even within the same protein family. Although homology modelling is the most accurate computational method for protein structure prediction, difficulties still arise in predicting protein loops. Protein loop structure prediction is therefore a bottleneck in solving the protein structure prediction problem.</p><p>Reflecting on the success of homology modelling, I implement an improved version of a database search method, FREAD. I show how sequence similarity, as quantified by environment-specific substitution scores, can be used to significantly improve loop prediction.</p><p>FREAD performs appreciably better than <em>ab initio</em> methods for an identifiable subset of loops (two thirds of the shorter loops and half of the longer loops tested); FREAD's predictive ability is length-independent. In general, it produces results within 2 Å root mean square deviation (RMSD) of the native conformations, compared to an average of over 10 Å at loop length 20 for any of the tested <em>ab initio</em> methods.</p><p>I then examine FREAD's predictive ability on a specific type of loop, the complementarity-determining regions (CDRs) of antibodies. The CDRs consist of six hypervariable loops and form the majority of the antigen-binding site. I examine CDR loop structure prediction as a general case of the loop structure prediction problem.
FREAD achieves accuracy similar to that of specific CDR predictors. However, it fails to accurately predict CDR-H3, which is known to be the most challenging CDR. Various FREAD versions, including FREAD with contact information (ConFREAD), are examined. The FREAD variants improve predictions for CDR-H3 on homology models and docked structures.</p><p>Lastly, I focus on the local properties of protein loops and demonstrate that the protein loop structure prediction problem is a local protein folding problem. The end-to-end distance of loops (loop span) follows a distinctive frequency distribution, regardless of the secondary structure elements connected or the number of residues in the loop. I show that the loop span distribution follows a Maxwell-Boltzmann distribution.</p><p>Based on my research, I propose future directions in protein loop structure prediction, including estimating experimentally undetermined local structures using FREAD, multiple loop structure prediction using contact information, and a novel <em>ab initio</em> method which makes use of loop stretch.</p>
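The description names three quantities: RMSD from the native conformation, loop span (end-to-end distance), and a Maxwell-Boltzmann distribution. The following is a minimal illustrative sketch of their standard definitions, not code from the thesis. The function names, the use of one representative point per residue, the absence of structural superposition before computing RMSD, and the scale parameter `a` are all assumptions made for illustration; the record does not state which atoms were used or what parameter values were fitted.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumption: the two structures are already in the same frame;
    no superposition is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def loop_span(coords):
    """End-to-end distance between the first and last points of a loop."""
    return math.dist(coords[0], coords[-1])


def maxwell_boltzmann_pdf(x, a):
    """Density of the Maxwell-Boltzmann distribution.

    `a` is a hypothetical scale parameter (a > 0); the record gives no fitted value.
    """
    if x < 0:
        return 0.0
    return math.sqrt(2 / math.pi) * x ** 2 * math.exp(-x ** 2 / (2 * a ** 2)) / a ** 3
```

With these definitions, a predicted loop "within 2 Å RMSD" is one for which `rmsd(predicted, native)` is at most 2.0 when coordinates are in ångströms, and the density peaks at a span of `math.sqrt(2) * a`.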
first_indexed 2024-03-07T03:39:22Z
format Thesis
id oxford-uuid:bd5c1b9b-89ba-4225-bc17-85d3f5067e58
institution University of Oxford
language English
last_indexed 2024-03-07T03:39:22Z
publishDate 2011
record_format dspace
spelling oxford-uuid:bd5c1b9b-89ba-4225-bc17-85d3f5067e58; 2022-03-27T05:31:15Z; Protein loop structure prediction; Thesis; http://purl.org/coar/resource_type/c_db06; uuid:bd5c1b9b-89ba-4225-bc17-85d3f5067e58; Protein folding; Statistics (see also social sciences); Bioinformatics (life sciences); Bioinformatics (biochemistry); English; Oxford University Research Archive - Valet; 2011; Choi, Y; Deane, C
spellingShingle Protein folding
Statistics (see also social sciences)
Bioinformatics (life sciences)
Bioinformatics (biochemistry)
Choi, Y
Protein loop structure prediction
title Protein loop structure prediction
title_full Protein loop structure prediction
title_fullStr Protein loop structure prediction
title_full_unstemmed Protein loop structure prediction
title_short Protein loop structure prediction
title_sort protein loop structure prediction
topic Protein folding
Statistics (see also social sciences)
Bioinformatics (life sciences)
Bioinformatics (biochemistry)
work_keys_str_mv AT choiy proteinloopstructureprediction