Summary: | <p>Existing machine learning algorithms predicting class I antigen presentation are fundamentally flawed due to the nature of the immunopeptidomics data used for their training. These models, rather than predicting antigen abundance, primarily indicate the presence of antigens on the cell surface. In this thesis, we integrate machine learning with mechanistic modelling to develop an enhanced model of the class I antigen processing pathway.</p>
<p>We begin by constructing a probabilistic model of epitope and precursor production by the proteasome, making use of existing algorithms to predict cleavage sites. We then develop a novel predictor of TAP binding affinity, PanTAP, which outperforms existing methods and forms accurate predictions across different mammalian species. Following this, we use a similar approach to train a predictive model of ERAP1 enzyme kinetics, enabling us to simulate the trimming of any potential substrate by ERAP1. These models are subsequently used to extend a previously validated systems biology model of peptide loading to MHC-I. This mechanistic model is validated using a study of SIINFEKL precursor processing and presentation in wild-type and ERAP1 knockdown cell lines, enabling us to infer the role of cytosolic aminopeptidases in epitope generation.</p>
<p>Finally, we use the validated mechanistic model to develop a new Predictor Of immunogenic Epitopes using Mechanistic modelling (POEM), employing a logistic regression trained on a dataset of neoantigens of known immunogenicity. POEM demonstrates superior efficacy on the training set and an independent dataset of GBM neoantigens. Furthermore, POEM accurately predicts the immunogenicity of pathogenic epitopes using a combined dataset from the IEDB, with its performance further validated through analysis of SARS-CoV-2 peptides across various HLA allotypes. Insights suggest that integrating source protein expression data could enhance POEM's predictions.</p>
<p>We conclude the thesis with a discussion of the results within the context of immunotherapy development and ideas for how our analysis may be further improved to provide clinical utility.</p>
|