Accurate reconstruction of insertion-deletion histories by statistical phylogenetics.

Bibliographic Details
Main Authors: Westesson, O, Lunter, G, Paten, B, Holmes, I
Format: Journal article
Language: English
Published: Public Library of Science 2012
Collection: OXFORD
Description: The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes.
ID: oxford-uuid:c1cd3a63-2e20-4321-b779-666a9e71b7e0
Institution: University of Oxford