Inference of genetic relatedness between viral quasispecies from sequencing data

Abstract Background: RNA viruses such as HCV and HIV mutate at extremely high rates and, as a result, exist in infected hosts as populations of genetically related variants. Recent advances in sequencing technologies make it possible to identify such populations at great depth. In particular, these...


Bibliographic Details
Main Authors: Olga Glebova, Sergey Knyazev, Andrew Melnyk, Alexander Artyomenko, Yury Khudyakov, Alex Zelikovsky, Pavel Skums
Format: Article
Language: English
Published: BMC 2017-12-01
Series: BMC Genomics
Subjects: Genetic relatedness; Transmission networks; Outbreaks investigations; Simulation; Clustering
Online Access: http://link.springer.com/article/10.1186/s12864-017-4274-5
author Olga Glebova
Sergey Knyazev
Andrew Melnyk
Alexander Artyomenko
Yury Khudyakov
Alex Zelikovsky
Pavel Skums
collection DOAJ
description Abstract. Background: RNA viruses such as HCV and HIV mutate at extremely high rates and, as a result, exist in infected hosts as populations of genetically related variants. Recent advances in sequencing technologies make it possible to identify such populations at great depth. In particular, these technologies provide new opportunities for inferring relatedness between viral samples and for identifying transmission clusters and sources of infection, which are crucial tasks in viral outbreak investigations. Results: We present (i) an evolutionary simulation algorithm, Viral Outbreak InferenCE (VOICE), that infers genetic relatedness; (ii) an algorithm, MinDistB, that detects possible transmission using minimal distances between intra-host viral populations and the sizes of their relative borders; and (iii) a non-parametric recursive clustering algorithm, Relatedness Depth (ReD), that analyzes the structure of clusters to infer possible transmissions and their directions. All proposed algorithms were validated using real sequencing data from HCV outbreaks. Conclusions: All algorithms are applicable to the analysis of outbreaks of highly heterogeneous RNA viruses. Our experimental validation shows that they can successfully identify genetic relatedness between viral populations, as well as infer transmission clusters and outbreak sources.
first_indexed 2024-12-10T22:36:03Z
format Article
id doaj.art-5074477777af4cdb8a4abca9314772fe
institution Directory Open Access Journal
issn 1471-2164
language English
last_indexed 2024-12-10T22:36:03Z
publishDate 2017-12-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-5074477777af4cdb8a4abca9314772fe
2022-12-22T01:30:53Z
eng
BMC
BMC Genomics
1471-2164
2017-12-01
18 S10 8188
10.1186/s12864-017-4274-5
Inference of genetic relatedness between viral quasispecies from sequencing data
Olga Glebova (Computer Science Department, Georgia State University)
Sergey Knyazev (Computer Science Department, Georgia State University)
Andrew Melnyk (Computer Science Department, Georgia State University)
Alexander Artyomenko (Computer Science Department, Georgia State University)
Yury Khudyakov (Centers for Disease Control and Prevention)
Alex Zelikovsky (Computer Science Department, Georgia State University)
Pavel Skums (Computer Science Department, Georgia State University)
http://link.springer.com/article/10.1186/s12864-017-4274-5
Genetic relatedness
Transmission networks
Outbreaks investigations
Simulation
Clustering
title Inference of genetic relatedness between viral quasispecies from sequencing data
topic Genetic relatedness
Transmission networks
Outbreaks investigations
Simulation
Clustering
url http://link.springer.com/article/10.1186/s12864-017-4274-5