VirClust—A Tool for Hierarchical Clustering, Core Protein Detection and Annotation of (Prokaryotic) Viruses
Main Author: | Cristina Moraru |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-04-01 |
Series: | Viruses |
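Subjects: | VirClust; virus genome clustering; virus protein clustering; virus protein annotation; virus classification; phage classification |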
Online Access: | https://www.mdpi.com/1999-4915/15/4/1007 |
_version_ | 1797603149733691392 |
---|---|
author | Cristina Moraru |
collection | DOAJ |
description | Recent years have seen major changes in the classification criteria and taxonomy of viruses. The current classification scheme, also called “megataxonomy of viruses”, recognizes six viral realms, defined by the presence of viral hallmark genes (VHGs). Within the realms, viruses are classified into hierarchical taxa, ideally defined by the phylogeny of their shared genes. To detect shared genes, viruses must first be clustered, and tools that assist with virus clustering and classification are currently needed. Here, VirClust is presented: a novel, reference-free tool that performs (i) protein clustering based on BLASTp and Hidden Markov Model (HMM) similarities; (ii) hierarchical clustering of viruses based on intergenomic distances calculated from their shared protein content; (iii) identification of core proteins; and (iv) annotation of viral proteins. VirClust offers flexible parameters both for protein clustering and for splitting the viral genome tree into smaller genome clusters corresponding to different taxonomic levels. Benchmarking on a phage dataset showed that the genome trees produced by VirClust match the current ICTV classification at the family, subfamily and genus levels. VirClust is freely available as a web service and as a stand-alone tool. |
first_indexed | 2024-03-11T04:26:21Z |
format | Article |
id | doaj.art-a1bf6749c0dd455ea4e3fdb42b657fb7 |
institution | Directory of Open Access Journals |
issn | 1999-4915 |
language | English |
last_indexed | 2024-03-11T04:26:21Z |
publishDate | 2023-04-01 |
publisher | MDPI AG |
record_format | Article |
series | Viruses |
spelling | doaj.art-a1bf6749c0dd455ea4e3fdb42b657fb7; 2023-11-17T21:46:53Z; eng; MDPI AG; Viruses; ISSN 1999-4915; 2023-04-01; vol. 15, no. 4, article 1007; DOI 10.3390/v15041007; Cristina Moraru (Institute for Chemistry and Biology of the Marine Environment, Carl-von-Ossietzky–Str. 9-11, 26111 Oldenburg, Germany); https://www.mdpi.com/1999-4915/15/4/1007; keywords: VirClust; virus genome clustering; virus protein clustering; virus protein annotation; virus classification; phage classification |
title | VirClust—A Tool for Hierarchical Clustering, Core Protein Detection and Annotation of (Prokaryotic) Viruses |
topic | VirClust; virus genome clustering; virus protein clustering; virus protein annotation; virus classification; phage classification |
url | https://www.mdpi.com/1999-4915/15/4/1007 |