Classifying the Unclassified: A Phage Classification Method

Bibliographic Details
Main Authors: Cynthia Maria Chibani, Anton Farr, Sandra Klama, Sascha Dietrich, Heiko Liesegang
Format: Article
Language: English
Published: MDPI AG 2019-02-01
Series: Viruses
Online Access: https://www.mdpi.com/1999-4915/11/2/195
Description
Summary: This work presents ClassiPhage, a method for classifying phage genomes using sequence-derived taxonomic features. ClassiPhage uses a set of phage-specific Hidden Markov Models (HMMs) generated from clusters of related proteins. The method was validated on all publicly available genomes of phages known to infect Vibrionaceae. These phages belong to the well-described families Myoviridae, Podoviridae, Siphoviridae, and Inoviridae. The resulting classification is consistent with the assignments of the International Committee on Taxonomy of Viruses (ICTV): all tested phages were assigned to the corresponding group in the ICTV database. In addition, 44 of the 58 Vibrio phage genomes not yet classified could be assigned to a phage family. The remaining 14 genomes may represent phages of new families or subfamilies. Comparative genomics indicates that the ability of the approach to identify and classify phages correlates with conserved genomic organization. ClassiPhage classifies phages exclusively on the basis of genome sequence data and can be applied to separate phage genomes as well as to prophage regions within host genomes. Possible applications include (a) classifying phages from assembled metagenomes and (b) identifying and classifying integrated prophages and splitting phage families into subfamilies.
ISSN: 1999-4915
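
The summary describes assigning a genome to a phage family from matches against family-specific HMMs. The record does not give the decision rule, so the following is only a minimal sketch of one plausible rule, not the ClassiPhage implementation. It assumes that each HMM hit on a genome has already been reduced to the family label of the matching HMM; the function name classify_genome, the min_hits threshold, and the tie handling are all hypothetical.

```python
from collections import Counter


def classify_genome(hit_families, min_hits=3):
    """Pick a family from the family labels of the HMMs that hit a genome.

    hit_families: one family label per HMM hit (hypothetical input format).
    min_hits: hypothetical minimum number of supporting hits; not a value
              taken from the article.
    """
    counts = Counter(hit_families)
    if not counts:
        # No family-specific HMM matched at all.
        return "unclassified"

    ranked = counts.most_common(2)
    best_family, best_count = ranked[0]
    # Treat an exact tie between the top two families as ambiguous.
    tied = len(ranked) > 1 and ranked[1][1] == best_count

    if best_count < min_hits or tied:
        return "unclassified"
    return best_family


if __name__ == "__main__":
    hits = ["Myoviridae"] * 4 + ["Siphoviridae"]
    print(classify_genome(hits))
```

Under these assumptions, a genome with too few or evenly split hits stays unclassified, which mirrors the summary's point that some genomes could not be assigned to any known family.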