Deep generative models for T cell receptor protein sequences
Probabilistic models of adaptive immune repertoire sequence distributions can be used to infer the expansion of immune cells in response to stimulus, differentiate genetic from environmental factors that determine repertoire sharing, and evaluate the suitability of various target immune sequences fo...
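Main Authors: | Kristian Davidsen, Branden J Olson, William S DeWitt III, Jean Feng, Elias Harkins, Philip Bradley, Frederick A Matsen IV |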
---|---|
Format: | Article |
Language: | English |
Published: | eLife Sciences Publications Ltd, 2019-09-01 |
Series: | eLife |
Subjects: | T cell receptor; variational autoencoder; repertoire modeling; vaccine; T cell expansion |
Online Access: | https://elifesciences.org/articles/46935 |
_version_ | 1818019896252956672 |
---|---|
author | Kristian Davidsen, Branden J Olson, William S DeWitt III, Jean Feng, Elias Harkins, Philip Bradley, Frederick A Matsen IV |
author_sort | Kristian Davidsen |
collection | DOAJ |
description | Probabilistic models of adaptive immune repertoire sequence distributions can be used to infer the expansion of immune cells in response to stimulus, differentiate genetic from environmental factors that determine repertoire sharing, and evaluate the suitability of various target immune sequences for stimulation via vaccination. Classically, these models are defined in terms of a probabilistic V(D)J recombination model which is sometimes combined with a selection model. In this paper we take a different approach, fitting variational autoencoder (VAE) models parameterized by deep neural networks to T cell receptor (TCR) repertoires. We show that simple VAE models can perform accurate cohort frequency estimation, learn the rules of VDJ recombination, and generalize well to unseen sequences. Further, we demonstrate that VAE-like models can distinguish between real sequences and sequences generated according to a recombination-selection model, and that many characteristics of VAE-generated sequences are similar to those of real sequences. |
first_indexed | 2024-04-14T07:58:48Z |
format | Article |
id | doaj.art-c830ffb8a0de4a7489ad9c4b44a47423 |
institution | Directory Open Access Journal |
issn | 2050-084X |
language | English |
last_indexed | 2024-04-14T07:58:48Z |
publishDate | 2019-09-01 |
publisher | eLife Sciences Publications Ltd |
record_format | Article |
series | eLife |
spelling | doaj.art-c830ffb8a0de4a7489ad9c4b44a47423; 2022-12-22T02:04:58Z; eng; eLife Sciences Publications Ltd; eLife; 2050-084X; 2019-09-01; 8; 10.7554/eLife.46935; Deep generative models for T cell receptor protein sequences. Authors: Kristian Davidsen [0] (https://orcid.org/0000-0002-3821-6902), Branden J Olson [1] (https://orcid.org/0000-0003-1951-8822), William S DeWitt III [2] (https://orcid.org/0000-0002-6802-9139), Jean Feng [3] (https://orcid.org/0000-0003-2041-3104), Elias Harkins [4], Philip Bradley [5] (https://orcid.org/0000-0002-0224-6464), Frederick A Matsen IV [6] (https://orcid.org/0000-0003-0607-6025). Affiliation listed for each of authors [0] to [6]: University of Washington, Seattle, United States; Fred Hutchinson Cancer Research Center, Seattle, United States. Probabilistic models of adaptive immune repertoire sequence distributions can be used to infer the expansion of immune cells in response to stimulus, differentiate genetic from environmental factors that determine repertoire sharing, and evaluate the suitability of various target immune sequences for stimulation via vaccination. Classically, these models are defined in terms of a probabilistic V(D)J recombination model which is sometimes combined with a selection model. In this paper we take a different approach, fitting variational autoencoder (VAE) models parameterized by deep neural networks to T cell receptor (TCR) repertoires. We show that simple VAE models can perform accurate cohort frequency estimation, learn the rules of VDJ recombination, and generalize well to unseen sequences. Further, we demonstrate that VAE-like models can distinguish between real sequences and sequences generated according to a recombination-selection model, and that many characteristics of VAE-generated sequences are similar to those of real sequences. https://elifesciences.org/articles/46935; T cell receptor; variational autoencoder; repertoire modeling; vaccine; T cell expansion |
title | Deep generative models for T cell receptor protein sequences |
topic | T cell receptor; variational autoencoder; repertoire modeling; vaccine; T cell expansion |
url | https://elifesciences.org/articles/46935 |