Comparison of methods for estimating the nucleotide substitution matrix

Abstract. Background: The nucleotide substitution rate matrix is a key parameter of molecular evolution. Several methods for inferring this parameter have been proposed, with different mathematical bases. These methods include counting sequence difference...

Bibliographic Details
Main Authors:	Huttley Gavin A, Yap Von Bing, McDonald Daniel, Oscamou Maribeth, Lladser Manuel E, Knight Rob
Format:	Article
Language:	English
Published:	BMC 2008-12-01
Series:	BMC Bioinformatics
Online Access:	http://www.biomedcentral.com/1471-2105/9/511

_version_	1818677547847647232
author	Huttley Gavin A; Yap Von Bing; McDonald Daniel; Oscamou Maribeth; Lladser Manuel E; Knight Rob
author_sort	Huttley Gavin A
collection	DOAJ
description	Abstract. Background: The nucleotide substitution rate matrix is a key parameter of molecular evolution. Several methods for inferring this parameter have been proposed, with different mathematical bases. These methods include counting sequence differences and taking the log of the resulting probability matrices, methods based on Markov triples, and maximum likelihood methods that infer the substitution probabilities that lead to the most likely model of evolution. However, the speed and accuracy of these methods have not been compared. Results: Different methods differ in performance by orders of magnitude (ranging from 1 ms to 10 s per matrix), but differences in accuracy of rate matrix reconstruction appear to be relatively small. Encouragingly, relatively simple and fast methods can provide results at least as accurate as far more complex and computationally intensive methods, especially when the sequences to be compared are relatively short. Conclusion: Based on the conditions tested, we recommend the use of the method of Gojobori et al. (1982) for long sequences (> 600 nucleotides), and the method of Goldman et al. (1996) for shorter sequences (< 600 nucleotides). The method of Barry and Hartigan (1987) can provide somewhat more accuracy, measured as the Euclidean distance between the true and inferred matrices, on long sequences (> 2000 nucleotides) at the expense of substantially longer computation time. The availability of methods that are both fast and accurate will allow us to gain a global picture of change in the nucleotide substitution rate matrix on a genomewide scale across the tree of life.
first_indexed	2024-12-17T09:01:07Z
format	Article
id	doaj.art-78633f65f64341e7bbde8cbe88c59909
institution	Directory Open Access Journal
issn	1471-2105
language	English
last_indexed	2024-12-17T09:01:07Z
publishDate	2008-12-01
publisher	BMC
record_format	Article
series	BMC Bioinformatics
spelling	doaj.art-78633f65f64341e7bbde8cbe88c59909 | 2022-12-21T21:55:42Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2008-12-01 | 9 | 1 | 511 | 10.1186/1471-2105-9-511 | Comparison of methods for estimating the nucleotide substitution matrix | Huttley Gavin A; Yap Von Bing; McDonald Daniel; Oscamou Maribeth; Lladser Manuel E; Knight Rob | http://www.biomedcentral.com/1471-2105/9/511
title	Comparison of methods for estimating the nucleotide substitution matrix
url	http://www.biomedcentral.com/1471-2105/9/511


Similar Items