Modelling genetic variations with fragmentation-coagulation processes

Bibliographic Details
Main Authors: Teh, Y; Blundell, C; Elliott, LT
Format: Journal article
Language: English
Published: 2011
Description: We propose a novel class of Bayesian nonparametric models for sequential data called fragmentation-coagulation processes (FCPs). FCPs model a set of sequences using a partition-valued Markov process which evolves by splitting and merging clusters. An FCP is exchangeable, projective, stationary and reversible, and its equilibrium distributions are given by the Chinese restaurant process. As opposed to hidden Markov models, FCPs allow for flexible modelling of the number of clusters, and they avoid label switching non-identifiability problems. We develop an efficient Gibbs sampler for FCPs which uses uniformization and the forward-backward algorithm. Our development of FCPs is motivated by applications in population genetics, and we demonstrate the utility of FCPs on problems of genotype imputation with phased and unphased SNP data.
Identifier: oxford-uuid:aa1409f2-644a-4e75-b3fd-6176cd9cbeaf
Institution: University of Oxford
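
Illustrative note: the description states that the equilibrium distributions of an FCP are given by the Chinese restaurant process (CRP). The sketch below draws one random partition from a CRP, as a rough picture of what the clustering at a single position looks like at equilibrium. It is not the FCP itself and not the authors' Gibbs sampler; the function name `sample_crp_partition`, the concentration parameter `alpha` and the seed are assumptions made for this example only.

```python
import random


def sample_crp_partition(n, alpha, rng):
    """Draw a partition of items 0..n-1 from a Chinese restaurant process.

    Hypothetical helper for illustration; `alpha` > 0 is the concentration.
    """
    clusters = []
    for i in range(n):
        # Item i joins an existing cluster with probability proportional to
        # its size, or opens a new cluster with probability proportional to alpha.
        weights = [len(c) for c in clusters] + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(clusters):
            clusters.append([i])
        else:
            clusters[k].append(i)
    return clusters


# Example: cluster 10 sequences at a single position.
partition = sample_crp_partition(10, alpha=1.0, rng=random.Random(0))
```

Larger values of `alpha` tend to give more, smaller clusters. The number of clusters is not fixed in advance, which is the flexibility the description contrasts with hidden Markov models.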