Statistical learning of peptide retention behavior in chromatographic separations: a new kernel-based approach for computational proteomics


Bibliographic Details
Main Authors: Huber Christian G, Leinenbach Andreas, Pfeifer Nico, Kohlbacher Oliver
Format: Article
Language: English
Published: BMC 2007-11-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/8/468
collection DOAJ
description Abstract
Background: High-throughput peptide and protein identification technologies have benefited tremendously from strategies based on tandem mass spectrometry (MS/MS) in combination with database searching algorithms. A major problem with existing methods lies in the significant number of false positive and false negative annotations. So far, standard algorithms for protein identification do not use the information gained from the separation processes usually involved in peptide analysis, such as retention time information, which is readily available from chromatographic separation of the sample. Identification can thus be improved by comparing measured retention times to predicted retention times. Current prediction models are derived from a set of measured test analytes, but they usually require large amounts of training data.
Results: We introduce a new kernel function which can be applied in combination with support vector machines to a wide range of computational proteomics problems. We show the performance of this new approach by applying it to the prediction of peptide adsorption/elution behavior in strong anion-exchange solid-phase extraction (SAX-SPE) and ion-pair reversed-phase high-performance liquid chromatography (IP-RP-HPLC). Furthermore, the predicted retention times are used to improve spectrum identifications by a p-value-based filtering approach. The approach was tested on a number of different datasets and shows excellent performance while requiring only very small training sets (about 40 peptides instead of thousands). Using the retention time predictor in our retention time filter significantly improves the fraction of correctly identified peptide mass spectra.
Conclusion: The proposed kernel function is well suited for the prediction of chromatographic separation in computational proteomics and requires only a limited amount of training data. The performance of this new method is demonstrated by applying it to peptide retention time prediction in IP-RP-HPLC and prediction of peptide sample fractionation in SAX-SPE. Finally, we incorporate the predicted chromatographic behavior in a p-value-based filter to improve peptide identifications based on liquid chromatography-tandem mass spectrometry.
id doaj.art-9bedd321e9864179a94c5892841f2c75
institution Directory Open Access Journal
issn 1471-2105
doi 10.1186/1471-2105-8-468