Detecting natural selection in RNA virus populations using sequence summary statistics

At present, most analyses that aim to detect the action of natural selection upon viral gene sequences use phylogenetic estimates of the ratio of silent to replacement mutations. Such methods, however, are impractical to compute on large data sets comprising hundreds of viral genomes, which are beco...

Bibliographic details
Main authors: Bhatt, S, Katzourakis, A, Pybus, O
Format: Journal article
Language: English
Published: Elsevier 2010
Subjects: Zoological sciences
collection OXFORD
description At present, most analyses that aim to detect the action of natural selection upon viral gene sequences use phylogenetic estimates of the ratio of silent to replacement mutations. Such methods, however, are impractical to compute on large data sets comprising hundreds of viral genomes, which are becoming increasingly common due to advances in genome sequencing technology. Here we investigate the statistical performance of computationally efficient tests that are based on sequence summary statistics, and explore their applicability to RNA virus data sets in two ways. Firstly, we perform extensive simulations in order to measure the type I error of two well-known summary statistic methods - Tajima's D and the McDonald-Kreitman test - under a range of virus-like mutational and demographic scenarios. Secondly, we apply these methods to a compilation of ~100 RNA virus alignments that represent natural RNA virus populations. In addition, we develop and introduce a new implementation of the McDonald-Kreitman test and show that it greatly improves the test's statistical reliability on typical viral data sets. Our results suggest that variants of the McDonald-Kreitman test could prove useful in the analysis of very large sets of highly diverse viral genetic data.
first_indexed 2024-03-07T03:37:44Z
format Journal article
id oxford-uuid:bcd73413-e3e5-4812-9c00-53339e2b4b3c
institution University of Oxford
language English
last_indexed 2024-03-07T03:37:44Z
publishDate 2010
publisher Elsevier
record_format dspace
topic Zoological sciences
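
The description above names Tajima's D as one of the two summary statistic methods evaluated. As an illustration of why such statistics are cheap to compute, the following is a minimal sketch of the standard textbook form of Tajima's D. It is not the authors' implementation. The function name `tajimas_d` is hypothetical, and the sketch assumes aligned, equal-length sequences with no gaps or ambiguity codes.

```python
from collections import Counter
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences.

    Assumption: no gaps or ambiguity codes; every character is treated as an allele.
    """
    n = len(seqs)
    if n < 4:
        # For n < 4 the variance term of the statistic is zero.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    n_pairs = n * (n - 1) / 2
    seg_sites = 0                 # S: number of segregating (variable) sites
    total_pairwise_diffs = 0.0    # differences summed over all sequence pairs

    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # Pairs that differ at this site = all pairs minus identical pairs.
            same = sum(c * (c - 1) / 2 for c in counts.values())
            total_pairwise_diffs += n_pairs - same

    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences.
    pi = total_pairwise_diffs / n_pairs

    # Constants that depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D compares pi with the estimate S / a1, scaled by the estimated standard deviation.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)
```

A call such as `tajimas_d(["TAAA", "ATAA", "AATA", "AAAT"])` returns a negative value, because every variable site in that toy alignment is a singleton. The whole computation is a single pass over alignment columns with no tree inference.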