A sampling framework for incorporating quantitative mass spectrometry data in protein interaction analysis

Background: Comprehensive protein-protein interaction (PPI) maps are a powerful resource for uncovering the molecular basis of genetic interactions and providing mechanistic insights. Over the past decade, high-throughput experimental techniques have been developed to generate PPI maps at proteome...

Bibliographic Details
Main Authors: Loh, Po-Ru, Berger, Bonnie, Tucker, George Jay
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: BioMed Central Ltd 2014
Online Access: http://hdl.handle.net/1721.1/85992
https://orcid.org/0000-0002-2724-7228
collection MIT
description
Background: Comprehensive protein-protein interaction (PPI) maps are a powerful resource for uncovering the molecular basis of genetic interactions and providing mechanistic insights. Over the past decade, high-throughput experimental techniques have been developed to generate PPI maps at proteome scale, first using yeast two-hybrid approaches and more recently via affinity purification combined with mass spectrometry (AP-MS). Unfortunately, data from both protocols are prone to high false-positive and false-negative rates. To address these issues, many methods have been developed to post-process raw PPI data. However, with few exceptions, these methods analyze only binary experimental data (in which each potential interaction tested is deemed either observed or unobserved), neglecting quantitative information available from AP-MS, such as spectral counts.

Results: We propose a novel method for incorporating quantitative information from AP-MS data into existing PPI inference methods that analyze binary interaction data. Our approach introduces a probabilistic framework that models the statistical noise inherent in observations of co-purifications. Using a sampling-based approach, we model the uncertainty of interactions with low spectral counts by generating an ensemble of possible alternative experimental outcomes. We then apply the existing method of choice to each alternative outcome and aggregate results over the ensemble. We validate our approach on three recent AP-MS data sets and demonstrate performance comparable to or better than state-of-the-art methods. Additionally, we provide an in-depth discussion comparing the theoretical bases of existing approaches and identify common aspects that may be key to their performance.

Conclusions: Our sampling framework extends the existing body of work on PPI analysis using binary interaction data to apply to the richer quantitative data now commonly available through AP-MS assays. This framework is quite general, and many enhancements are likely possible. Fruitful future directions may include investigating more sophisticated schemes for converting spectral counts to probabilities and applying the framework to direct protein complex prediction methods.
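The sampling procedure summarized in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the count-to-probability mapping `count_to_prob` (with its `scale` parameter), the stand-in binary method `co_purification_score`, and the toy proteins A through D are all hypothetical choices made here for illustration. The sketch only shows the overall shape of the idea: convert each spectral count to a probability, sample an ensemble of binary outcomes, run a binary-data method on each, and average the results.

```python
import math
import random
from collections import defaultdict
from itertools import combinations


def count_to_prob(count, scale=2.0):
    """Hypothetical mapping from a spectral count to an observation probability.

    Assumption for illustration only; the record does not specify the mapping.
    """
    return 1.0 - math.exp(-count / scale)


def sample_outcome(counts, rng):
    """Draw one alternative binary outcome: keep each bait-prey pair
    independently with the probability derived from its spectral count."""
    return {pair for pair, c in counts.items() if rng.random() < count_to_prob(c)}


def co_purification_score(observed):
    """Stand-in for an existing binary-data method (an assumption):
    count the purifications in which two proteins appear together."""
    by_bait = defaultdict(set)
    for bait, prey in observed:
        by_bait[bait].update((bait, prey))
    scores = defaultdict(float)
    for members in by_bait.values():
        for a, b in combinations(sorted(members), 2):
            scores[(a, b)] += 1.0
    return scores


def ensemble_scores(counts, n_samples=200, seed=0):
    """Apply the binary method to each sampled outcome and average the scores."""
    rng = random.Random(seed)
    totals = defaultdict(float)
    for _ in range(n_samples):
        sampled = sample_outcome(counts, rng)
        for pair, s in co_purification_score(sampled).items():
            totals[pair] += s
    return {pair: t / n_samples for pair, t in totals.items()}


if __name__ == "__main__":
    # Toy data: (bait, prey) -> spectral count.
    toy_counts = {("A", "B"): 50, ("A", "C"): 1, ("A", "D"): 0}
    for pair, score in sorted(ensemble_scores(toy_counts).items()):
        print(pair, round(score, 3))
```

In this sketch a pair supported by a high spectral count appears in nearly every sampled outcome, while a pair supported by a single spectrum appears in only a fraction of them, so its averaged score is lower. Any method that takes binary observations could be substituted for `co_purification_score`.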
first_indexed 2024-09-23T16:23:26Z
format Article
id mit-1721.1/85992
institution Massachusetts Institute of Technology
language English
last_indexed 2024-09-23T16:23:26Z
publishDate 2014
publisher BioMed Central Ltd
record_format dspace
spelling
Record: mit-1721.1/85992 (2022-09-29T19:43:20Z)
Title: A sampling framework for incorporating quantitative mass spectrometry data in protein interaction analysis
Authors: Loh, Po-Ru; Berger, Bonnie; Tucker, George Jay
Affiliations: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Mathematics
Funding: National Human Genome Research Institute (U.S.) (Grant T32 HG002295); National Science Foundation (U.S.). Graduate Research Fellowship Program; National Institutes of Health (U.S.) (NIH R01 Grant GM081871)
Dates: 2014-04-03T16:01:41Z; 2014-04-03T16:01:41Z; 2013-10; 2013-05; 2014-04-02T15:20:15Z
Type: Article (http://purl.org/eprint/type/JournalArticle)
ISSN: 1471-2105
Handle: http://hdl.handle.net/1721.1/85992
Citation: Tucker, George, Po-Ru Loh, and Bonnie Berger. “A Sampling Framework for Incorporating Quantitative Mass Spectrometry Data in Protein Interaction Analysis.” BMC Bioinformatics 14.1 (2013): 299.
ORCID: https://orcid.org/0000-0002-2724-7228
Language: en
DOI: http://dx.doi.org/10.1186/1471-2105-14-299
Journal: BMC Bioinformatics
License: Creative Commons Attribution http://creativecommons.org/licenses/by/2.0
Rights: George Tucker et al.; licensee BioMed Central Ltd.
File format: application/pdf
Publisher: BioMed Central Ltd