sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations
Abstract Background Identifying polymorphism clades on phylogenetic trees could help detect punctual mutations that are associated with viral functions. With visualization tools coloring the tree, it is easy to visually find clades where most sequences have the same polymorphism state. However, with...
Main Authors: | Chengyang Ji, Na Han, Yexiao Cheng, Jingzhe Shang, Shenghui Weng, Rong Yang, Hang-Yu Zhou, Aiping Wu |
---|---|
Format: | Article |
Language: | English |
Published: | BMC, 2022-11-01 |
Series: | BMC Bioinformatics |
Subjects: | Phylogenetics, Sequence analysis, Visualization |
Online Access: | https://doi.org/10.1186/s12859-022-05064-4 |
_version_ | 1811186878248910848 |
---|---|
author | Chengyang Ji Na Han Yexiao Cheng Jingzhe Shang Shenghui Weng Rong Yang Hang-Yu Zhou Aiping Wu |
author_facet | Chengyang Ji Na Han Yexiao Cheng Jingzhe Shang Shenghui Weng Rong Yang Hang-Yu Zhou Aiping Wu |
author_sort | Chengyang Ji |
collection | DOAJ |
description | Abstract Background: Identifying polymorphism clades on phylogenetic trees could help detect punctual mutations that are associated with viral functions. With visualization tools coloring the tree, it is easy to visually find clades where most sequences have the same polymorphism state. However, with the fast accumulation of viral sequences, a computational tool to automate this process is urgently needed. Results: Here, by implementing a branch-and-bound-like search method, we developed an R package named sitePath to identify polymorphism clades automatically. Based on the identified polymorphism clades, fixed and parallel mutations could be inferred. Furthermore, sitePath also integrates visualization tools to generate figures of the calculated results. In an example with the influenza A virus H3N2 dataset, the detected fixed mutations coincide with antigenic shift mutations. High specificity and sensitivity of sitePath in finding fixed mutations were achieved for a range of parameters and different phylogenetic tree inference software. Conclusions: The results suggest that sitePath can identify polymorphism clades per site. The clustering of sequences on a phylogenetic tree can be used to infer fixed and parallel mutations. High-quality figures of the calculated results can also be generated by sitePath. |
first_indexed | 2024-04-11T13:53:39Z |
format | Article |
id | doaj.art-5e1e37572b2c4cd2ba387b5e25f9ade3 |
institution | Directory Open Access Journal |
issn | 1471-2105 |
language | English |
last_indexed | 2024-04-11T13:53:39Z |
publishDate | 2022-11-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-5e1e37572b2c4cd2ba387b5e25f9ade3 2022-12-22T04:20:27Z eng BMC BMC Bioinformatics 1471-2105 2022-11-01 23117 10.1186/s12859-022-05064-4 sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations Chengyang Ji, Na Han, Yexiao Cheng, Jingzhe Shang, Shenghui Weng, Rong Yang, Hang-Yu Zhou, Aiping Wu (all authors: Institute of Systems Medicine, Chinese Academy of Medical Sciences and Peking Union Medical College) Abstract Background: Identifying polymorphism clades on phylogenetic trees could help detect punctual mutations that are associated with viral functions. With visualization tools coloring the tree, it is easy to visually find clades where most sequences have the same polymorphism state. However, with the fast accumulation of viral sequences, a computational tool to automate this process is urgently needed. Results: Here, by implementing a branch-and-bound-like search method, we developed an R package named sitePath to identify polymorphism clades automatically. Based on the identified polymorphism clades, fixed and parallel mutations could be inferred. Furthermore, sitePath also integrates visualization tools to generate figures of the calculated results. In an example with the influenza A virus H3N2 dataset, the detected fixed mutations coincide with antigenic shift mutations. High specificity and sensitivity of sitePath in finding fixed mutations were achieved for a range of parameters and different phylogenetic tree inference software. Conclusions: The results suggest that sitePath can identify polymorphism clades per site. The clustering of sequences on a phylogenetic tree can be used to infer fixed and parallel mutations. High-quality figures of the calculated results can also be generated by sitePath. https://doi.org/10.1186/s12859-022-05064-4 Phylogenetics Sequence analysis Visualization |
spellingShingle | Chengyang Ji Na Han Yexiao Cheng Jingzhe Shang Shenghui Weng Rong Yang Hang-Yu Zhou Aiping Wu sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations BMC Bioinformatics Phylogenetics Sequence analysis Visualization |
title | sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations |
title_full | sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations |
title_fullStr | sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations |
title_full_unstemmed | sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations |
title_short | sitePath: a visual tool to identify polymorphism clades and help find fixed and parallel mutations |
title_sort | sitepath a visual tool to identify polymorphism clades and help find fixed and parallel mutations |
topic | Phylogenetics Sequence analysis Visualization |
url | https://doi.org/10.1186/s12859-022-05064-4 |
work_keys_str_mv | AT chengyangji sitepathavisualtooltoidentifypolymorphismcladesandhelpfindfixedandparallelmutations AT nahan sitepathavisualtooltoidentifypolymorphismcladesandhelpfindfixedandparallelmutations AT yexiaocheng sitepathavisualtooltoidentifypolymorphismcladesandhelpfindfixedandparallelmutations AT jingzheshang sitepathavisualtooltoidentifypolymorphismcladesandhelpfindfixedandparallelmutations AT shenghuiweng sitepathavisualtooltoidentifypolymorphismcladesandhelpfindfixedandparallelmutations AT rongyang sitepathavisualtooltoidentifypolymorphismcladesandhelpfindfixedandparallelmutations AT hangyuzhou sitepathavisualtooltoidentifypolymorphismcladesandhelpfindfixedandparallelmutations AT aipingwu sitepathavisualtooltoidentifypolymorphismcladesandhelpfindfixedandparallelmutations |