Predicting Class II MHC-Peptide binding: a kernel based approach using similarity scores


Bibliographic Details
Main Authors: Salomon, J, Flower, D
Format: Journal article
Language: English
Published: BioMed Central 2006
Subjects: Bioinformatics (life sciences)
_version_ 1797051797062287360
author Salomon, J
Flower, D
collection OXFORD
description Background: Modelling the interaction between potentially antigenic peptides and Major Histocompatibility Complex (MHC) molecules is a key step in identifying potential T-cell epitopes. For Class II MHC alleles, the binding groove is open at both ends, which makes the positional alignment between groove and peptide ambiguous and leaves it uncertain which parts of the peptide interact with the MHC. Moreover, antigenic peptides vary in length, making naive modelling methods difficult to apply. This paper introduces a kernel method that handles variable-length peptides effectively by quantifying similarities between peptide sequences and integrating these into the kernel. Results: The kernel approach presented here shows increased prediction accuracy, with a significantly higher number of true positives and true negatives on multiple MHC class II alleles, when tested on data sets from MHCPEP [1], MHCBN [2], and MHCBench [3]. Evaluation by cross-validation, when segregating binders and non-binders, produced an average area under the ROC curve (AROC) of 0.824 for the MHCBench data sets (up from 0.756) and an average AROC of 0.96 for multiple alleles of the MHCPEP database. Conclusion: The method improves on existing state-of-the-art methods for MHC class II peptide binding prediction by using a custom, knowledge-based representation of peptides. Similarity scores, in contrast to a fixed-length, pocket-specific representation of amino acids, provide a flexible and powerful way of modelling MHC binding and can readily be applied to other dynamic sequence problems.
first_indexed 2024-03-06T18:24:27Z
format Journal article
id oxford-uuid:07743f9d-29f4-49b3-8f2b-2932ca636aa7
institution University of Oxford
language English
last_indexed 2024-03-06T18:24:27Z
publishDate 2006
publisher BioMed Central
record_format dspace
topic Bioinformatics (life sciences)