PhageTailFinder: A tool for phage tail module detection and annotation


Bibliographic Details
Main Authors: Fengxia Zhou, Han Yang, Yu Si, Rui Gan, Ling Yu, Chuangeng Chen, Chunyan Ren, Jiqiu Wu, Fan Zhang
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-01-01
Series: Frontiers in Genetics
Subjects: phage; tail gene cluster; two-state HMM; DBSCAN; phage therapy
Online Access: https://www.frontiersin.org/articles/10.3389/fgene.2023.947466/full
collection DOAJ
description Decades of overconsumption of antimicrobials in the treatment and prevention of bacterial infections have resulted in the increasing emergence of drug-resistant bacteria, which poses a significant challenge to public health, driving the urgent need to find alternatives to conventional antibiotics. Bacteriophages are viruses infecting specific bacterial hosts, often destroying the infected bacterial hosts. Phages attach to and enter their potential hosts using their tail proteins, with the composition of the tail determining the range of potentially infected bacteria. To aid the exploitation of bacteriophages for therapeutic purposes, we developed the PhageTailFinder algorithm to predict tail-related proteins and identify the putative tail module in previously uncharacterized phages. The PhageTailFinder relies on a two-state hidden Markov model (HMM) to predict the probability of a given protein being tail-related. The process takes into account the natural modularity of phage tail-related proteins, rather than simply considering amino acid properties or secondary structures for each protein in isolation. The PhageTailFinder exhibited robust predictive power for phage tail proteins in novel phages due to this sequence-independent operation. The performance of the prediction model was evaluated in 13 extensively studied phages and a sample of 992 complete phages from the NCBI database. The algorithm achieved a high true-positive prediction rate (>80%) in over half (571) of the studied phages, and the ROC value was 0.877 using general models and 0.968 using corresponding morphologic models. It is notable that the median ROC value of 992 complete phages is more than 0.75 even for novel phages, indicating the high accuracy and specificity of the PhageTailFinder. When applied to a dataset containing 189,680 viral genomes derived from 11,810 bulk metagenomic human stool samples, the ROC value was 0.895. 
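The two-state HMM idea described above can be illustrated with a minimal sketch. It is not the PhageTailFinder implementation: the function name `posterior_tail`, the transition parameter `stay`, the prior `prior_tail`, and the per-protein emission likelihoods are all hypothetical values chosen for illustration. The sketch runs the standard forward-backward algorithm over proteins in genome order, so that a protein with ambiguous evidence inherits support from tail-like neighbours, which is one way to exploit the modularity the abstract mentions.

```python
def posterior_tail(emissions, stay=0.9, prior_tail=0.2):
    """Posterior probability that each protein is tail-related.

    Two hidden states: 0 = non-tail, 1 = tail.
    emissions: list of (P(obs | non-tail), P(obs | tail)) tuples,
               one per protein, in genome order (hypothetical inputs).
    stay:      assumed probability of remaining in the same state
               between neighbouring proteins.
    """
    trans = [[stay, 1.0 - stay], [1.0 - stay, stay]]
    start = [1.0 - prior_tail, prior_tail]
    n = len(emissions)

    # Forward pass, rescaled at every step to avoid underflow.
    fwd, prev = [], None
    for t, e in enumerate(emissions):
        if t == 0:
            a = [start[s] * e[s] for s in (0, 1)]
        else:
            a = [e[s] * sum(prev[r] * trans[r][s] for r in (0, 1))
                 for s in (0, 1)]
        z = sum(a)
        prev = [x / z for x in a]
        fwd.append(prev)

    # Backward pass, also rescaled.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        b = [sum(trans[s][r] * emissions[t + 1][r] * bwd[t + 1][r]
                 for r in (0, 1)) for s in (0, 1)]
        z = sum(b)
        bwd[t] = [x / z for x in b]

    # Combine both passes and normalise per position.
    post = []
    for f, b in zip(fwd, bwd):
        u0, u1 = f[0] * b[0], f[1] * b[1]
        post.append(u1 / (u0 + u1))
    return post


if __name__ == "__main__":
    # The middle protein has uninformative evidence but tail-like neighbours.
    demo = [(0.1, 0.9), (0.5, 0.5), (0.1, 0.9)]
    print(posterior_tail(demo, stay=0.9, prior_tail=0.5))
```

With a high `stay` value the model favours contiguous runs of tail proteins; lowering it makes each protein's call depend more on its own evidence.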
In addition, tail protein clusters can be identified for further study using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. PhageTailFinder is available as a web server (http://www.microbiome-bigdata.com/PHISDetector/index/tools/PhageTailFinder) and as a stand-alone program that runs on a standard desktop computer (https://github.com/HIT-ImmunologyLab/PhageTailFinder).
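The DBSCAN step can be sketched in the same spirit. The record does not say which features the tool clusters, so this example assumes, purely for illustration, that the inputs are the genome-order indices of proteins predicted to be tail-related. The function `dbscan_1d` and the parameter values are hypothetical; the logic is the standard DBSCAN procedure (core points, border points, noise) restricted to one dimension.

```python
def dbscan_1d(points, eps, min_pts):
    """Minimal DBSCAN for one-dimensional points.

    Returns one label per point: 0, 1, ... for clusters, -1 for noise.
    A point is a core point if at least min_pts points (itself included)
    lie within distance eps of it.
    """
    n = len(points)
    labels = [None] * n

    def neighbours(i):
        return [j for j in range(n) if abs(points[i] - points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = -1          # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue             # already assigned; do not expand again
            labels[j] = cluster
            nb_j = neighbours(j)
            if len(nb_j) >= min_pts:
                queue.extend(nb_j)   # only core points extend the cluster
    return labels


if __name__ == "__main__":
    # Hypothetical gene indices of predicted tail proteins along one genome.
    predicted_tail_positions = [3, 12, 13, 14, 15, 30, 41, 42, 43]
    print(dbscan_1d(predicted_tail_positions, eps=1.5, min_pts=3))
```

Isolated predictions are labelled as noise, while runs of adjacent predictions form candidate tail modules; `eps` and `min_pts` control how large a gap and how small a run are tolerated.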
id doaj.art-70e2768980d345808afd4470789a0f16
institution Directory Open Access Journal
issn 1664-8021
spelling doaj.art-70e2768980d345808afd4470789a0f16 2023-01-23T15:41:54Z eng Frontiers Media S.A. Frontiers in Genetics 1664-8021 2023-01-01 14 10.3389/fgene.2023.947466 947466
PhageTailFinder: A tool for phage tail module detection and annotation
Fengxia Zhou (0), Han Yang (1), Yu Si (2), Rui Gan (3), Ling Yu (4), Chuangeng Chen (5), Chunyan Ren (6), Jiqiu Wu (7), Fan Zhang (8), Fan Zhang (9)
(0-5, 8) HIT Center for Life Sciences, School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
(6) Department of Hematology, Department of Oncology, Boston Children’s Hospital, Harvard Medical School, Boston, MA, United States
(7) Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
(9) Anhui Province Key Laboratory of Medical Physics and Technology, Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, China
https://www.frontiersin.org/articles/10.3389/fgene.2023.947466/full
phage; tail gene cluster; two-state HMM; DBSCAN; phage therapy
topic phage
tail gene cluster
two-state HMM
DBSCAN
phage therapy