Probabilistic modelling of genomic trajectories
<p>The recent advancement of whole-transcriptome gene expression quantification technology - particularly at the single-cell level - has created a wealth of biological data. An increasingly popular unsupervised analysis is to find one-dimensional manifolds or <em>trajectories</em>...</p>
Main Author: | Campbell, K |
---|---|
Other Authors: | Yau, C |
Format: | Thesis |
Published: | 2017 |
Subjects: | Genomics--Statistical methods; Bioinformatics |
_version_ | 1826315942905249792 |
---|---|
author | Campbell, K |
author2 | Yau, C |
author_facet | Yau, C Campbell, K |
author_sort | Campbell, K |
collection | OXFORD |
description | <p>The recent advancement of whole-transcriptome gene expression quantification technology - particularly at the single-cell level - has created a wealth of biological data. An increasingly popular unsupervised analysis is to find one-dimensional manifolds or <em>trajectories</em> through such data that track the development of some biological process. Such methods may be necessary due to the lack of explicit time series measurements or due to asynchronicity of the biological process at a given time.</p> <p>This thesis aims to recast trajectory inference from high-dimensional "omics" data as a statistical latent variable problem. We begin by examining sources of uncertainty in current approaches and the consequences of propagating such uncertainty to downstream analyses. We also introduce a model of switch-like differentiation along trajectories. Next, we consider inferring such trajectories through parametric nonlinear factor analysis models and demonstrate that incorporating information about gene behaviour as informative Bayesian priors improves inference. We then consider the case of bifurcations in data and demonstrate the extent to which they may be modelled using a hierarchical mixture of factor analysers. Finally, we propose a novel type of latent variable model that performs inference of such trajectories in the presence of heterogeneous genetic and environmental backgrounds. We apply this to both single-cell and population-level cancer datasets and propose a nonparametric extension similar to Gaussian Process Latent Variable Models.</p> |
first_indexed | 2024-03-06T19:53:57Z |
format | Thesis |
id | oxford-uuid:24e6704c-8a7f-4967-9fcd-95d6034eab39 |
institution | University of Oxford |
last_indexed | 2024-12-09T03:35:17Z |
publishDate | 2017 |
record_format | dspace |
spelling | oxford-uuid:24e6704c-8a7f-4967-9fcd-95d6034eab39 2024-12-01T18:51:33Z Probabilistic modelling of genomic trajectories Thesis http://purl.org/coar/resource_type/c_db06 uuid:24e6704c-8a7f-4967-9fcd-95d6034eab39 Genomics--Statistical methods Bioinformatics ORA Deposit 2017 Campbell, K Yau, C Webber, C |
title | Probabilistic modelling of genomic trajectories |
topic | Genomics--Statistical methods Bioinformatics |
work_keys_str_mv | AT campbellk probabilisticmodellingofgenomictrajectories |