Calibrating E-values for MS² database search methods


Bibliographic Details
Main Authors: Shen Rong-Fong, Wang Guanghui, Wu Wells W, Ogurtsov Aleksey Y, Alves Gelio, Yu Yi-Kuo
Format: Article
Language: English
Published: BMC 2007-11-01
Series: Biology Direct
Online Access: http://www.biology-direct.com/content/2/1/26
collection DOAJ
description Abstract

Background: The key to mass-spectrometry-based proteomics is peptide identification, which relies on software analysis of tandem mass spectra. Although each search engine has its strengths, combining the strengths of various search engines is not yet realizable, largely due to the lack of a unified statistical framework that is applicable to any method.

Results: We have developed a universal scheme for statistical calibration of peptide identifications. The protocol can be used for de novo approaches as well as database search methods. We demonstrate the protocol using only database search methods. Seven methods were calibrated: SEQUEST (v27 rev12), ProbID (v1.0), InsPecT (v20060505), Mascot (v2.1), X!Tandem (v1.0), OMSSA (v2.0) and RAId_DbS. Most of these methods, with the exception of X!Tandem and RAId_DbS, require a rescaling according to the size of the database searched. We demonstrate that our calibration protocol indeed produces unified statistics, both in terms of the average number of false positives and in terms of the probability for a peptide hit to be a true positive. Although both the calibration protocol and the statistics thus calibrated are universal, the calibration formulas obtained by one laboratory with data collected in either centroid or profile format may not be directly usable by other laboratories. Each laboratory is therefore encouraged to calibrate the search methods it intends to use. We also address the importance of using spectrum-specific statistics and possible improvements to the current calibration protocol. The spectra used for statistical (E-value) calibration are freely available upon request.

Open peer review: Reviewed by Dongxiao Zhu (nominated by Arcady Mushegian), Alexey Nesvizhskii (nominated by King Jordan) and Vineet Bafna. For the full reviews, please go to the Reviewers' comments section.
first_indexed 2024-12-14T05:42:57Z
format Article
id doaj.art-afb29d043c16402f8fef6fdfd9f51649
institution Directory Open Access Journal
issn 1745-6150
language English
last_indexed 2024-12-14T05:42:57Z
spelling doaj.art-afb29d043c16402f8fef6fdfd9f51649; 2022-12-21T23:14:58Z; eng; BMC; Biology Direct; 1745-6150; 2007-11-01; vol. 2, no. 1, art. 26; doi:10.1186/1745-6150-2-26; http://www.biology-direct.com/content/2/1/26