De novo assembly of highly diverse viral populations

Abstract. Background: Extensive genetic diversity in viral populations within infected hosts and the divergence of variants from existing reference genomes impede the analysis of deep viral sequencing data. A de novo population consen...

Bibliographic Details
Main Authors: Yang Xiao, Charlebois Patrick, Gnerre Sante, Coole Matthew G, Lennon Niall J, Levin Joshua Z, Qu James, Ryan Elizabeth M, Zody Michael C, Henn Matthew R
Format: Article
Language: English
Published: BMC 2012-09-01
Series: BMC Genomics
Online Access: http://www.biomedcentral.com/1471-2164/13/475
collection DOAJ
description Abstract
Background: Extensive genetic diversity in viral populations within infected hosts and the divergence of variants from existing reference genomes impede the analysis of deep viral sequencing data. A de novo population consensus assembly is valuable both as a single linear representation of the population and as a backbone on which intra-host variants can be accurately mapped. The availability of consensus assemblies and robustly mapped variants is crucial to the genetic study of viral disease progression, transmission dynamics, and viral evolution. Existing de novo assembly techniques fail to robustly assemble ultra-deep sequence data from genetically heterogeneous populations such as viruses into full-length genomes because of extensive genetic variability, contaminants, and variable sequence coverage.
Results: We present VICUNA, a de novo assembly algorithm suitable for generating consensus assemblies from genetically heterogeneous populations. We demonstrate its effectiveness on Dengue, Human Immunodeficiency, and West Nile viral populations, which represent a range of intra-host diversity. Compared to state-of-the-art assemblers designed for haploid or diploid systems, VICUNA recovers full-length consensus and captures insertion/deletion polymorphisms in diverse samples. Final assemblies maintain a high base calling accuracy. The VICUNA program is publicly available at: http://www.broadinstitute.org/scientific-community/science/projects/viral-genomics/viral-genomics-analysis-software.
Conclusions: We developed VICUNA, a publicly available software tool that enables consensus assembly of ultra-deep sequence data derived from diverse viral populations. While VICUNA was developed for the analysis of viral populations, its application to other heterogeneous sequence data sets, such as metagenomic or tumor cell population samples, may prove beneficial in these fields of research.
first_indexed 2024-12-10T10:45:00Z
id doaj.art-0403d0e217dc4c69a42504e055c4ca55
institution Directory Open Access Journal
issn 1471-2164
last_indexed 2024-12-10T10:45:00Z
spelling doaj.art-0403d0e217dc4c69a42504e055c4ca55 2022-12-22T01:52:11Z eng BMC BMC Genomics 1471-2164 2012-09-01 13 1 475 10.1186/1471-2164-13-475