sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations

Abstract. Background: Identifying polymorphism clades on phylogenetic trees could help detect punctual mutations that are associated with viral functions. With visualization tools coloring the tree, it is easy to visually find clades where most sequences have the same polymorphism state. However, with...

Bibliographic Details
Main Authors: Chengyang Ji, Na Han, Yexiao Cheng, Jingzhe Shang, Shenghui Weng, Rong Yang, Hang-Yu Zhou, Aiping Wu
Format: Article
Language: English
Published: BMC 2022-11-01
Series: BMC Bioinformatics
Subjects: Phylogenetics, Sequence analysis, Visualization
Online Access: https://doi.org/10.1186/s12859-022-05064-4
author Chengyang Ji
Na Han
Yexiao Cheng
Jingzhe Shang
Shenghui Weng
Rong Yang
Hang-Yu Zhou
Aiping Wu
collection DOAJ
description Abstract. Background: Identifying polymorphism clades on phylogenetic trees could help detect punctual mutations that are associated with viral functions. With visualization tools coloring the tree, it is easy to visually find clades where most sequences have the same polymorphism state. However, with the fast accumulation of viral sequences, a computational tool to automate this process is urgently needed. Results: Here, by implementing a branch-and-bound-like search method, we developed an R package named sitePath to identify polymorphism clades automatically. Based on the identified polymorphism clades, fixed and parallel mutations can be inferred. Furthermore, sitePath integrates visualization tools to generate figures of the calculated results. In an example with the influenza A virus H3N2 dataset, the detected fixed mutations coincide with antigenic shift mutations. sitePath achieved high specificity and sensitivity in finding fixed mutations across a range of parameters and different phylogenetic tree inference software. Conclusions: The results suggest that sitePath can identify polymorphism clades per site. The clustering of sequences on a phylogenetic tree can be used to infer fixed and parallel mutations. High-quality figures of the calculated results can also be generated by sitePath.
first_indexed 2024-04-11T13:53:39Z
format Article
id doaj.art-5e1e37572b2c4cd2ba387b5e25f9ade3
institution Directory of Open Access Journals
issn 1471-2105
language English
last_indexed 2024-04-11T13:53:39Z
publishDate 2022-11-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-5e1e37572b2c4cd2ba387b5e25f9ade3 | 2022-12-22T04:20:27Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2022-11-01 | 23117 | 10.1186/s12859-022-05064-4 | sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations | Chengyang Ji, Na Han, Yexiao Cheng, Jingzhe Shang, Shenghui Weng, Rong Yang, Hang-Yu Zhou, Aiping Wu (all: Institute of Systems Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College) | https://doi.org/10.1186/s12859-022-05064-4 | Phylogenetics; Sequence analysis; Visualization
title sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations
topic Phylogenetics
Sequence analysis
Visualization
url https://doi.org/10.1186/s12859-022-05064-4