Comparison of methods for estimating the nucleotide substitution matrix
Abstract. Background: The nucleotide substitution rate matrix is a key parameter of molecular evolution. Several methods for inferring this parameter have been proposed, with different mathematical bases. These methods include counting sequence difference...
Main Authors: | Huttley Gavin A, Yap Von Bing, McDonald Daniel, Oscamou Maribeth, Lladser Manuel E, Knight Rob |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2008-12-01 |
Series: | BMC Bioinformatics |
Online Access: | http://www.biomedcentral.com/1471-2105/9/511 |
_version_ | 1818677547847647232 |
---|---|
author | Huttley Gavin A Yap Von Bing McDonald Daniel Oscamou Maribeth Lladser Manuel E Knight Rob |
author_sort | Huttley Gavin A |
collection | DOAJ |
description | Abstract. Background: The nucleotide substitution rate matrix is a key parameter of molecular evolution. Several methods for inferring this parameter have been proposed, with different mathematical bases. These methods include counting sequence differences and taking the log of the resulting probability matrices, methods based on Markov triples, and maximum likelihood methods that infer the substitution probabilities that lead to the most likely model of evolution. However, the speed and accuracy of these methods have not been compared. Results: The methods differ in running time by orders of magnitude (ranging from 1 ms to 10 s per matrix), but differences in the accuracy of rate matrix reconstruction appear to be relatively small. Encouragingly, relatively simple and fast methods can provide results at least as accurate as far more complex and computationally intensive methods, especially when the sequences to be compared are relatively short. Conclusion: Based on the conditions tested, we recommend the method of Gojobori et al. (1982) for long sequences (> 600 nucleotides), and the method of Goldman et al. (1996) for shorter sequences (< 600 nucleotides). The method of Barry and Hartigan (1987) can provide somewhat more accuracy, measured as the Euclidean distance between the true and inferred matrices, on long sequences (> 2000 nucleotides) at the expense of substantially longer computation time. The availability of methods that are both fast and accurate will allow us to gain a global picture of change in the nucleotide substitution rate matrix on a genomewide scale across the tree of life. |
first_indexed | 2024-12-17T09:01:07Z |
format | Article |
id | doaj.art-78633f65f64341e7bbde8cbe88c59909 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-12-17T09:01:07Z |
publishDate | 2008-12-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
title | Comparison of methods for estimating the nucleotide substitution matrix |
url | http://www.biomedcentral.com/1471-2105/9/511 |
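
The abstract describes one family of methods as counting sequence differences and taking the log of the resulting probability matrix, and it measures accuracy as the Euclidean distance between the true and inferred matrices. The following is a minimal sketch of that general idea only; it is not the implementation of Gojobori et al., Goldman et al., or Barry and Hartigan, and all function names are hypothetical. It assumes two pre-aligned sequences, uses a truncated power series for the matrix logarithm (valid only when the probability matrix is close to the identity, i.e. low divergence), and checks itself against a Jukes-Cantor matrix chosen here purely for illustration. Note that the logarithm recovers the rate matrix multiplied by elapsed time, not the rate matrix alone.

```python
import math

ALPHABET = "ACGT"


def count_matrix(seq_a, seq_b):
    """counts[i][j] = number of aligned sites with base i in seq_a and base j in seq_b."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    counts = [[0] * 4 for _ in range(4)]
    for a, b in zip(seq_a, seq_b):
        if a in index and b in index:  # skip gaps and ambiguity codes
            counts[index[a]][index[b]] += 1
    return counts


def row_normalize(counts):
    """Convert a count matrix into a row-stochastic probability matrix P."""
    probs = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a base never occurs in the first sequence")
        probs.append([c / total for c in row])
    return probs


def mat_mul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matrix_log(p, terms=200):
    """Approximate log(P) with the series sum_k (-1)^(k+1) (P - I)^k / k.

    Assumption: P is close enough to the identity for the series to converge.
    """
    n = len(p)
    delta = [[p[i][j] - (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    power = [row[:] for row in delta]  # holds (P - I)^k
    result = [[0.0] * n for _ in range(n)]
    for k in range(1, terms + 1):
        sign = 1.0 if k % 2 else -1.0
        for i in range(n):
            for j in range(n):
                result[i][j] += sign * power[i][j] / k
        power = mat_mul(power, delta)
    return result


def euclidean_distance(a, b):
    """Euclidean (Frobenius) distance between two matrices of equal shape."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))


def jukes_cantor_p(rate_time):
    """Jukes-Cantor P(t); rate_time is the off-diagonal entry of Q*t (illustrative model)."""
    decay = math.exp(-4.0 * rate_time)
    same, diff = 0.25 + 0.75 * decay, 0.25 - 0.25 * decay
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


# Round trip on a known matrix: log(P) should recover Q*t.
true_qt = [[-0.15 if i == j else 0.05 for j in range(4)] for i in range(4)]
estimated_qt = matrix_log(jukes_cantor_p(0.05))
error = euclidean_distance(true_qt, estimated_qt)
```

With real data, `row_normalize(count_matrix(seq_a, seq_b))` would supply the matrix passed to `matrix_log`; sampling noise in short alignments can make the estimate unreliable or the series divergent, which is one reason the abstract's recommendations depend on sequence length.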