Exact Likelihood Calculation under the Infinite Sites Model

Bibliographic Details
Main Authors: Muhammad Faisal, Andreas Futschik, Claus Vogl
Format: Article
Language: English
Published: MDPI AG 2015-12-01
Series: Computation
Subjects:
Online Access: http://www.mdpi.com/2079-3197/3/4/701
Description
Summary: A key parameter in population genetics is the scaled mutation rate θ = 4Nμ, where N is the effective haploid population size and μ is the mutation rate per haplotype per generation. While exact likelihood inference is notoriously difficult in population genetics, we propose a novel approach to compute a first order accurate likelihood of θ that is based on dynamic programming under the infinite sites model without recombination. The parameter θ may be either constant, i.e., time-independent, or time-dependent, which allows for changes of demography and deviations from neutral equilibrium. For time-independent θ, the performance is compared to the approach in Griffiths and Tavaré’s work “Simulating Probability Distributions in the Coalescent” (Theor. Popul. Biol. 1994, 46, 131–159) that is based on importance sampling and implemented in the “genetree” program. Roughly, the proposed method is computationally fast when n × θ < 100, where n is the sample size. For time-dependent θ(t), we analyze a simple demographic model with a single change in θ(t). In this case, the ancestral and current θ need to be estimated, as well as the time of change. To our knowledge, this is the first accurate computation of a likelihood in the infinite sites model with non-equilibrium demography.
ISSN: 2079-3197
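
The quantities named in the summary can be written down in a few lines of Python. This is only an illustrative sketch of the parameterization (θ = 4Nμ, the rough n × θ < 100 rule of thumb, and a θ(t) with a single change); it is not the authors' dynamic programming algorithm, and it does not compute a likelihood. All function and class names are hypothetical, and measuring time backward from the present is an assumption made here for illustration.

```python
from dataclasses import dataclass


def scaled_mutation_rate(N: float, mu: float) -> float:
    """Return theta = 4 * N * mu, as defined in the summary."""
    return 4.0 * N * mu


def is_roughly_fast(n: int, theta: float, threshold: float = 100.0) -> bool:
    """Rule of thumb from the summary: the method is fast when n * theta < 100."""
    return n * theta < threshold


@dataclass
class SingleChangeTheta:
    """Piecewise-constant theta(t) with one change point.

    Assumption (not stated in the summary): t is measured backward
    from the present, so theta_current applies for t < t_change and
    theta_ancestral applies for t >= t_change.
    """
    theta_current: float
    theta_ancestral: float
    t_change: float

    def __call__(self, t: float) -> float:
        return self.theta_current if t < self.t_change else self.theta_ancestral


if __name__ == "__main__":
    theta = scaled_mutation_rate(N=10_000, mu=2.5e-8)  # illustrative values only
    print(theta, is_roughly_fast(n=50, theta=theta))
    model = SingleChangeTheta(theta_current=2.0, theta_ancestral=0.5, t_change=1.0)
    print(model(0.5), model(1.5))
```

The three fields of `SingleChangeTheta` correspond to the three quantities the summary says must be estimated in the time-dependent case: the current θ, the ancestral θ, and the time of change.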