Real time classification of viruses in 12 dimensions.
The International Committee on Taxonomy of Viruses authorizes and organizes the taxonomic classification of viruses. Thus far, the detailed classifications for all viruses are neither complete nor free from dispute. For example, the current missing label rates in GenBank are 12.1% for family label a...
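The description in this record mentions a Natural Vector representation that embeds each genome in 12 dimensions, but it does not define the vector. As a minimal sketch, assume the definition commonly given in the Natural Vector literature: for each nucleotide A, C, G, T, take its count, its mean position, and its second central moment of position normalized by count and sequence length (4 x 3 = 12 numbers). The function names `natural_vector` and `nearest_label`, the Euclidean distance, and the nearest-neighbor rule are illustrative assumptions, not details confirmed by this record.

```python
import math
from typing import List, Sequence, Tuple


def natural_vector(seq: str, alphabet: str = "ACGT") -> List[float]:
    """Return an assumed 12-dimensional natural vector for a nucleotide sequence.

    Layout (assumption): 4 counts, then 4 mean positions, then 4 normalized
    second central moments, in the order given by `alphabet`.
    """
    seq = seq.upper()
    n = len(seq)
    counts: List[float] = []
    means: List[float] = []
    moments: List[float] = []
    for base in alphabet:
        # 1-based positions at which this nucleotide occurs
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        if n_k == 0:
            # Convention chosen here: an absent nucleotide contributes zeros
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def nearest_label(query: str, references: Sequence[Tuple[str, str]]) -> str:
    """Return the label of the reference sequence closest to `query`.

    `references` holds (label, sequence) pairs; closeness is Euclidean
    distance between natural vectors (an assumed choice of metric).
    """
    q = natural_vector(query)
    best_label, _ = min(
        ((label, math.dist(q, natural_vector(s))) for label, s in references),
        key=lambda pair: pair[1],
    )
    return best_label
```

Because each genome reduces to a fixed-length vector, comparing a new genome against a reference set needs only vector distances rather than sequence alignment, which is one plausible reading of the "real time" claim in the title.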
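Main Authors: | Chenglong Yu, Troy Hernandez, Hui Zheng, Shek-Chung Yau, Hsin-Hsiung Huang, Rong Lucy He, Jie Yang, Stephen S-T Yau |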
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2013-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC3661469?pdf=render |
_version_ | 1819238199431528448 |
---|---|
author | Chenglong Yu Troy Hernandez Hui Zheng Shek-Chung Yau Hsin-Hsiung Huang Rong Lucy He Jie Yang Stephen S-T Yau |
author_facet | Chenglong Yu Troy Hernandez Hui Zheng Shek-Chung Yau Hsin-Hsiung Huang Rong Lucy He Jie Yang Stephen S-T Yau |
author_sort | Chenglong Yu |
collection | DOAJ |
description | The International Committee on Taxonomy of Viruses authorizes and organizes the taxonomic classification of viruses. Thus far, the detailed classifications for all viruses are neither complete nor free from dispute. For example, the current missing label rates in GenBank are 12.1% for family label and 30.0% for genus label. Using the proposed Natural Vector representation, all 2,044 single-segment referenced viral genomes in GenBank can be embedded in $\mathbb{R}^{12}$. Unlike other approaches, this allows us to determine phylogenetic relations for all viruses at any level (e.g., Baltimore class, family, subfamily, genus, and species) in real time. Additionally, the proposed graphical representation for virus phylogeny provides a visualization of the distribution of viruses in $\mathbb{R}^{12}$. Unlike the commonly used tree visualization methods which suffer from uniqueness and existence problems, our representation always exists and is unique. This approach is successfully used to predict and correct viral classification information, as well as to identify viral origins; e.g., a recent public health threat, the West Nile virus, is closer to the Japanese encephalitis antigenic complex based on our visualization. Based on cross-validation results, the accuracy rates of our predictions are as high as 98.2% for Baltimore class labels, 96.6% for family labels, 99.7% for subfamily labels and 97.2% for genus labels. |
first_indexed | 2024-12-23T13:32:26Z |
format | Article |
id | doaj.art-ecb7954478a9456294ecc5b3688b7f82 |
institution | Directory Open Access Journal |
issn | 1932-6203 |
language | English |
last_indexed | 2024-12-23T13:32:26Z |
publishDate | 2013-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS ONE |
spelling | doaj.art-ecb7954478a9456294ecc5b3688b7f82; 2022-12-21T17:45:07Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2013-01-01; 8; 5; e64328; 10.1371/journal.pone.0064328; Real time classification of viruses in 12 dimensions.; Chenglong Yu; Troy Hernandez; Hui Zheng; Shek-Chung Yau; Hsin-Hsiung Huang; Rong Lucy He; Jie Yang; Stephen S-T Yau; The International Committee on Taxonomy of Viruses authorizes and organizes the taxonomic classification of viruses. Thus far, the detailed classifications for all viruses are neither complete nor free from dispute. For example, the current missing label rates in GenBank are 12.1% for family label and 30.0% for genus label. Using the proposed Natural Vector representation, all 2,044 single-segment referenced viral genomes in GenBank can be embedded in $\mathbb{R}^{12}$. Unlike other approaches, this allows us to determine phylogenetic relations for all viruses at any level (e.g., Baltimore class, family, subfamily, genus, and species) in real time. Additionally, the proposed graphical representation for virus phylogeny provides a visualization of the distribution of viruses in $\mathbb{R}^{12}$. Unlike the commonly used tree visualization methods which suffer from uniqueness and existence problems, our representation always exists and is unique. This approach is successfully used to predict and correct viral classification information, as well as to identify viral origins; e.g., a recent public health threat, the West Nile virus, is closer to the Japanese encephalitis antigenic complex based on our visualization. Based on cross-validation results, the accuracy rates of our predictions are as high as 98.2% for Baltimore class labels, 96.6% for family labels, 99.7% for subfamily labels and 97.2% for genus labels.; http://europepmc.org/articles/PMC3661469?pdf=render |
title | Real time classification of viruses in 12 dimensions. |
url | http://europepmc.org/articles/PMC3661469?pdf=render |