Real time classification of viruses in 12 dimensions.

The International Committee on Taxonomy of Viruses authorizes and organizes the taxonomic classification of viruses. Thus far, the detailed classifications for all viruses are neither complete nor free from dispute. For example, the current missing label rates in GenBank are 12.1% for family label a...

Bibliographic Details
Main Authors: Chenglong Yu, Troy Hernandez, Hui Zheng, Shek-Chung Yau, Hsin-Hsiung Huang, Rong Lucy He, Jie Yang, Stephen S-T Yau
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3661469?pdf=render
collection DOAJ
description The International Committee on Taxonomy of Viruses authorizes and organizes the taxonomic classification of viruses. Thus far, the detailed classifications for all viruses are neither complete nor free from dispute. For example, the current missing label rates in GenBank are 12.1% for family label and 30.0% for genus label. Using the proposed Natural Vector representation, all 2,044 single-segment referenced viral genomes in GenBank can be embedded in 12-dimensional Euclidean space (R^12). Unlike other approaches, this allows us to determine phylogenetic relations for all viruses at any level (e.g., Baltimore class, family, subfamily, genus, and species) in real time. Additionally, the proposed graphical representation for virus phylogeny provides a visualization of the distribution of viruses in R^12. Unlike the commonly used tree visualization methods, which suffer from uniqueness and existence problems, our representation always exists and is unique. This approach is successfully used to predict and correct viral classification information, as well as to identify viral origins; e.g., a recent public health threat, the West Nile virus, is closer to the Japanese encephalitis antigenic complex based on our visualization. Based on cross-validation results, the accuracy rates of our predictions are as high as 98.2% for Baltimore class labels, 96.6% for family labels, 99.7% for subfamily labels and 97.2% for genus labels.
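The description names the Natural Vector representation but does not define it. The sketch below illustrates one commonly cited form of that idea, stated here as an assumption and not taken from this record: for each nucleotide A, C, G, T it records the count, the mean position, and a normalized second central moment of the positions, giving 12 numbers per genome. The function names `natural_vector`, `euclidean`, and `nearest_label`, the component ordering, the handling of absent nucleotides, and the use of a nearest-neighbour rule for label prediction are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence, Tuple

NUCLEOTIDES = "ACGT"


def natural_vector(seq: str) -> List[float]:
    """Return a 12-dimensional vector for a DNA sequence (assumed definition).

    Layout (an illustrative choice): 4 counts, 4 mean positions,
    4 normalized second central moments, in the order A, C, G, T.
    Characters other than A, C, G, T are ignored for the per-nucleotide
    statistics but still count toward the sequence length n.
    """
    seq = seq.upper()
    n = len(seq)
    counts: List[float] = []
    means: List[float] = []
    moments: List[float] = []
    for k in NUCLEOTIDES:
        # 1-based positions at which nucleotide k occurs
        positions = [i + 1 for i, ch in enumerate(seq) if ch == k]
        n_k = len(positions)
        if n_k == 0:
            # Assumption: an absent nucleotide contributes zeros
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_label(query: Sequence[float],
                  references: Sequence[Tuple[str, Sequence[float]]]) -> str:
    """Return the label of the reference vector closest to the query."""
    label, _ = min(references, key=lambda item: euclidean(query, item[1]))
    return label


if __name__ == "__main__":
    # Toy sequences only; real use would read genomes from GenBank records.
    refs = [("alpha", natural_vector("AAAA")), ("beta", natural_vector("TTTT"))]
    print(natural_vector("ACGT"))
    print(nearest_label(natural_vector("AAAT"), refs))
```

Because each genome is reduced to a fixed-length vector in a single pass, comparing a new genome against stored references needs only distance computations in 12 dimensions, which is consistent with the "real time" framing in the description. No alignment step is involved in this sketch.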
first_indexed 2024-12-23T13:32:26Z
id doaj.art-ecb7954478a9456294ecc5b3688b7f82
institution Directory Open Access Journal
issn 1932-6203
last_indexed 2024-12-23T13:32:26Z
spelling doaj.art-ecb7954478a9456294ecc5b3688b7f82; 2022-12-21T17:45:07Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2013-01-01; volume 8, issue 5, article e64328; doi:10.1371/journal.pone.0064328