Summary: | We present a novel algorithm, implemented in the software ARGinfer, for probabilistic inference of the
Ancestral Recombination Graph under the Coalescent with Recombination. Our Markov Chain Monte Carlo
algorithm takes advantage of the Succinct Tree Sequence data structure, which has enabled great advances in
simulation and point estimation but has not yet been applied to probabilistic inference. Unlike previous methods, which employ
the Sequentially Markov Coalescent approximation, ARGinfer uses the Coalescent with Recombination,
allowing more accurate inference of key evolutionary parameters. We show using simulations that ARGinfer
can accurately estimate many properties of the evolutionary history of the sample, including the topology and
branch lengths of the genealogical tree at each sequence site, and the times and locations of mutation and
recombination events. ARGinfer approximates posterior probability distributions for these and other
quantities, providing interpretable assessments of uncertainty that we show to be well calibrated. ARGinfer is
currently limited to tens of DNA sequences of several hundred kilobases each, but has scope for further
computational improvements to increase its applicability.
|