Virosaurus A Reference to Explore and Capture Virus Genetic Diversity

The huge genetic diversity of circulating viruses is a challenge for diagnostic assays for emerging or rare viral diseases. High-throughput technology offers a new opportunity to explore the global virome of patients without preconception about the culpable pathogens. It requires a solid reference d...

Bibliographic Details
Main Authors: Anne Gleizes, Florian Laubscher, Nicolas Guex, Christian Iseli, Thomas Junier, Samuel Cordey, Jacques Fellay, Ioannis Xenarios, Laurent Kaiser, Philippe Le Mercier
Format: Article
Language: English
Published: MDPI AG 2020-11-01
Series: Viruses
Subjects: database, complete genome, bioinformatics, HTS, diagnostics, sequencing
Online Access: https://www.mdpi.com/1999-4915/12/11/1248
author Anne Gleizes
Florian Laubscher
Nicolas Guex
Christian Iseli
Thomas Junier
Samuel Cordey
Jacques Fellay
Ioannis Xenarios
Laurent Kaiser
Philippe Le Mercier
collection DOAJ
description The huge genetic diversity of circulating viruses is a challenge for diagnostic assays for emerging or rare viral diseases. High-throughput technology offers a new opportunity to explore the global virome of patients without preconception about the culpable pathogens. It requires a solid reference dataset to be accurate. Virosaurus has been designed to offer a non-biased, automatized and annotated database for clinical metagenomics studies and diagnosis. Raw viral sequences have been extracted from GenBank, and cleaned up to remove potentially erroneous sequences. Complete sequences have been identified for all genera infecting vertebrates, plants and other eukaryotes (insect, fungus, etc.). To facilitate the analysis of clinically relevant viruses, we have annotated all sequences with official and common virus names, acronym, genotypes, and genomic features (linear, circular, DNA, RNA, etc.). Sequences have been clustered to remove redundancy at 90% or 98% identity. The analysis of clustering results reveals the state of the virus genetic landscape knowledge. Because herpes and poxviruses were under-represented in complete genomes considering their potential diversity in nature, we used genes instead of complete genomes for those in Virosaurus.
first_indexed 2024-03-10T15:09:38Z
format Article
id doaj.art-18768f6cf35948759a93fbeb7c228c57
institution Directory Open Access Journal
issn 1999-4915
language English
last_indexed 2024-03-10T15:09:38Z
publishDate 2020-11-01
publisher MDPI AG
record_format Article
series Viruses
spelling doaj.art-18768f6cf35948759a93fbeb7c228c57, 2023-11-20T19:25:52Z, eng, MDPI AG, Viruses, ISSN 1999-4915, 2020-11-01, vol. 12, no. 11, article 1248, doi:10.3390/v12111248
Virosaurus A Reference to Explore and Capture Virus Genetic Diversity
Anne Gleizes (Vital-IT Group, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland)
Florian Laubscher (Division of Infectious Diseases, Geneva University Hospitals, 1205 Geneva, Switzerland)
Nicolas Guex (Bioinformatics Competence Center, University of Lausanne, 1015 Lausanne, Switzerland)
Christian Iseli (Bioinformatics Competence Center, University of Lausanne, 1015 Lausanne, Switzerland)
Thomas Junier (Vital-IT Group, SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland)
Samuel Cordey (Division of Infectious Diseases, Geneva University Hospitals, 1205 Geneva, Switzerland)
Jacques Fellay (Unité de Médecine de Précision, CHUV, 1015 Lausanne, Switzerland)
Ioannis Xenarios (Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, 1015 Lausanne, Switzerland)
Laurent Kaiser (Division of Infectious Diseases, Geneva University Hospitals, 1205 Geneva, Switzerland)
Philippe Le Mercier (Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, 1011 Geneva, Switzerland)
title Virosaurus A Reference to Explore and Capture Virus Genetic Diversity
topic database
complete genome
bioinformatics
HTS
diagnostics
sequencing
url https://www.mdpi.com/1999-4915/12/11/1248