Determinants of antigenicity and specificity in immune response for protein sequences

Bibliographic Details
Main Authors: Li Cheng, White Kevin P, Negre Nicolas N, Wu Wenjun, Wang Yulong, Shah Parantu K
Format: Article
Language: English
Published: BMC 2011-06-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/12/251
author Li Cheng
White Kevin P
Negre Nicolas N
Wu Wenjun
Wang Yulong
Shah Parantu K
collection DOAJ
description Abstract
Background: Target-specific antibodies are pivotal for the design of vaccines, immunodiagnostic tests, proteomics studies for cancer biomarker discovery, identification of protein-DNA and other interactions, and small and large biochemical assays. It is therefore important to understand which properties of protein sequences determine antigenicity, and to identify small peptide epitopes and large regions in the linear sequence of proteins whose use results in specific antibodies.
Results: Our analysis of protein properties suggested that sequence composition, combined with evolutionary information, predicted secondary structure, and solvent accessibility, is sufficient to predict successful peptide epitopes. Antigenicity and specificity in immune response were also found to depend on epitope length. We trained the B-Cell Epitope Oracle (BEOracle), a support vector machine (SVM) classifier, to identify continuous B-cell epitopes using these protein properties as learning features. BEOracle achieved an F1-measure of 81.37% on a large validation set. It outperformed both classical propensity-based methods and more sophisticated methods such as BCPred and Bepipred for B-cell epitope prediction. BEOracle also identified peptides for the ChIP-grade antibodies from the modENCODE/ENCODE projects with 96.88% accuracy. High BEOracle scores for peptides showed some correlation with antibody intensity in immunofluorescence studies on fly embryos. Finally, a second SVM classifier, the B-Cell Region Oracle (BROracle), was trained with the BEOracle scores as features to predict with high accuracy the performance of antibodies generated with large protein regions. BROracle achieved accuracies between 63.88% and 75.26% on a validation set with immunofluorescence, immunohistochemistry, protein array, and western blot results from the Protein Atlas database.
Conclusions: Together, our results suggest that antigenicity is a local property of protein sequences, and that the sequence properties of composition, secondary structure, solvent accessibility, and evolutionary conservation are the determinants of antigenicity and specificity in immune response. Moreover, specificity in immune response could also be accurately predicted for large protein regions without knowledge of the protein tertiary structure or of the presence of discontinuous epitopes. The dataset prepared in this work and the classifier models are available for download at https://sites.google.com/site/oracleclassifiers/.
first_indexed 2024-12-10T13:34:39Z
format Article
id doaj.art-54af7e4f55f341f498ba323583ade0f6
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-12-10T13:34:39Z
publishDate 2011-06-01
publisher BMC
record_format Article
series BMC Bioinformatics
doi 10.1186/1471-2105-12-251
title Determinants of antigenicity and specificity in immune response for protein sequences
url http://www.biomedcentral.com/1471-2105/12/251