Probabilistic modelling of genomic trajectories

The recent advancement of whole-transcriptome gene expression quantification technology, particularly at the single-cell level, has created a wealth of biological data. An increasingly popular unsupervised analysis is to find one-dimensional manifolds or trajectories...

Bibliographic Details
Main Author: Campbell, K
Other Authors: Yau, C
Format: Thesis
Published: 2017
Subjects: Genomics--Statistical methods; Bioinformatics
_version_ 1826315942905249792
author Campbell, K
author2 Yau, C
collection OXFORD
description The recent advancement of whole-transcriptome gene expression quantification technology, particularly at the single-cell level, has created a wealth of biological data. An increasingly popular unsupervised analysis is to find one-dimensional manifolds, or trajectories, through such data that track the development of some biological process. Such methods may be necessary due to the lack of explicit time-series measurements or to asynchronicity of the biological process at a given time. This thesis aims to recast trajectory inference from high-dimensional "omics" data as a statistical latent variable problem. We begin by examining sources of uncertainty in current approaches and the consequences of propagating such uncertainty to downstream analyses. We also introduce a model of switch-like differentiation along trajectories. Next, we consider inferring such trajectories through parametric nonlinear factor analysis models and demonstrate that incorporating information about gene behaviour as informative Bayesian priors improves inference. We then consider the case of bifurcations in data and demonstrate the extent to which they may be modelled using a hierarchical mixture of factor analysers. Finally, we propose a novel type of latent variable model that infers such trajectories in the presence of heterogeneous genetic and environmental backgrounds. We apply this model to both single-cell and population-level cancer datasets and propose a nonparametric extension similar to Gaussian Process Latent Variable Models.
first_indexed 2024-03-06T19:53:57Z
format Thesis
id oxford-uuid:24e6704c-8a7f-4967-9fcd-95d6034eab39
institution University of Oxford
last_indexed 2024-12-09T03:35:17Z
publishDate 2017
record_format dspace
spelling oxford-uuid:24e6704c-8a7f-4967-9fcd-95d6034eab39; 2024-12-01T18:51:33Z; Probabilistic modelling of genomic trajectories; Thesis; http://purl.org/coar/resource_type/c_db06; uuid:24e6704c-8a7f-4967-9fcd-95d6034eab39; Genomics--Statistical methods; Bioinformatics; ORA Deposit; 2017; Campbell, K; Yau, C; Webber, C
title Probabilistic modelling of genomic trajectories
topic Genomics--Statistical methods
Bioinformatics