Quantification of Uncertainty in Peptide-MHC Binding Prediction Improves High-Affinity Peptide Selection for Therapeutic Design

Bibliographic Details
Main Authors: Zeng, Haoyang; Gifford, David K
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: English
Published: Elsevier BV 2020
Online Access: https://hdl.handle.net/1721.1/128919
Description
Summary: The computational identification of peptides that can bind the major histocompatibility complex (MHC) with high affinity is an essential step in developing personal immunotherapies and vaccines. We introduce PUFFIN, a deep residual network-based computational approach that quantifies the uncertainty in peptide-MHC affinity prediction arising from observational noise and the lack of relevant training examples. With PUFFIN's uncertainty metrics, we define binding likelihood, the probability that a peptide binds to a given MHC allele at a specified affinity threshold. Compared to affinity point estimates, we find that binding likelihood correlates better with the observed affinity and reduces false positives in high-affinity peptide design. When applied to examine an existing peptide vaccine, PUFFIN identifies an alternative vaccine formulation with higher binding likelihood. PUFFIN is freely available for download at http://github.com/gifford-lab/PUFFIN. Machine-learning models that predict the binding affinity of a peptide-MHC pair are essential in peptide-based therapeutic design, but state-of-the-art methods provide point estimates of affinity that do not consider measurement noise and model uncertainty. We introduce PUFFIN, a method that quantifies the prediction uncertainty and prioritizes peptides by “binding likelihood” to achieve improved accuracy in high-affinity peptide selection for therapeutic design.
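
The idea of a binding likelihood can be illustrated with a minimal sketch. It is not PUFFIN's implementation. It assumes, purely for illustration, that a model returns a predictive mean and standard deviation for a transformed affinity, that the predictive distribution is Gaussian, and that affinity is rescaled as 1 - log(IC50)/log(50000) so that larger values mean stronger binding. The function names, the 500 nM default threshold, and the example numbers are all hypothetical.

```python
import math


def transform_ic50(ic50_nm, max_ic50_nm=50000.0):
    """Rescale an IC50 in nM so that larger values mean stronger binding.

    Assumption: the log-scaling 1 - log(IC50)/log(50000) is used here only
    as an illustrative affinity scale.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 1.0 - math.log(ic50_nm) / math.log(max_ic50_nm)


def binding_likelihood(pred_mean, pred_std, threshold_nm=500.0):
    """Probability that the transformed affinity meets or exceeds the threshold.

    Assumption: the predictive distribution is Normal(pred_mean, pred_std).
    """
    if pred_std <= 0:
        raise ValueError("pred_std must be positive")
    threshold = transform_ic50(threshold_nm)
    z = (threshold - pred_mean) / pred_std
    # Upper-tail probability of a standard normal: 1 - Phi(z)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Two hypothetical peptides with the same point estimate but different
# uncertainty: a point-estimate ranking cannot separate them, while the
# binding likelihood favors the more certain prediction.
confident = binding_likelihood(pred_mean=0.50, pred_std=0.05)
uncertain = binding_likelihood(pred_mean=0.50, pred_std=0.30)
print(f"confident: {confident:.3f}, uncertain: {uncertain:.3f}")
```

Under these assumptions, two peptides with identical point estimates above the threshold receive different scores, which is the mechanism by which ranking on a likelihood rather than a point estimate could reduce false positives.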