Analysis of the RNA-Seq Data of Solanum tuberosum Revealed Viral Sequence Reads of a Severe Laboratory-Developed Strain of SARS-CoV-2 Containing Novel Substitutions


Bibliographic Details
Main Author: Alireza Mohebbi
Format: Article
Language: English
Published: Golestan University Of Medical Sciences 2022-06-01
Series: Journal of Clinical and Basic Research
Online Access: http://jcbr.goums.ac.ir/article-1-391-en.pdf
Description
Summary: Background: Metagenomics is a promising approach to discovering novel sequences of microorganisms in environmental samples. A recently published RNA-Seq dataset of Solanum tuberosum from China was used for a metavirome study. Methods: RNA-Seq data from a BioSample project of S. tuberosum, comprising Sequence Read Archive (SRA) data for six plant samples, were imported into the Galaxy server. Transcriptome data were de novo assembled for viral sequences with rnaviralSPAdes. The contig files were further organized using the VirHunter and Kraken tools. Raw SRA data were trimmed and assembled for the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genome with coronaSPAdes. The scaffolds were arranged by pairwise alignment against the SARS-CoV-2 reference genome (NC_045512.2). The Coronavirus Typing Tool, Nextstrain, and Pangolin platforms were used for SARS-CoV-2 genotyping, phylogenetic analysis, and mutation estimation. Results: Several environmentally related, non-intact viral sequence reads from forest animals, moths, bacteria, and amoebae were detected. Further investigation revealed non-indigenous SARS-CoV-2 genome sequences of lineage B with novel substitutions. Three polymorphisms, A22D and A36V in the envelope protein and Q498H in the spike (S) glycoprotein, which were recently reported in a mouse-adapted strain of SARS-CoV-2 with enhanced virulence, were detected in all samples. Further novel substitutions in ORF1ab were also uncovered: L1457V, D4553N, W6538S, I1525T, D1585Y, D6928G, N3414K, and T3432S. Two unexpected frameshifts, ORF1a:2338-4401 and ORF1a:3681-4401, were also detected. Conclusion: The findings of the present study highlight the threat posed by newly emerged, potentially severe genotypes of SARS-CoV-2 bearing substitutions that have not yet been reported clinically.
ISSN: 2538-3736
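
The substitution labels in the summary (for example Q498H or L1457V) follow the usual reference residue, 1-based position, observed residue convention. The following is a minimal Python sketch of how such labels can be parsed, and how they could be derived from a pairwise protein alignment against a reference. It is not the study's method, which relied on the Coronavirus Typing Tool, Nextstrain, and Pangolin. The names Substitution, parse_label, and call_substitutions are hypothetical, and the aligned sequences in the usage lines are toy strings, not real SARS-CoV-2 proteins.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    ref: str  # reference amino acid (one-letter code)
    pos: int  # 1-based position in the reference protein
    alt: str  # observed amino acid (one-letter code)


# Matches labels such as "A22D"; stop codons and indels are not handled here.
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_label(label: str) -> Substitution:
    """Split a label like 'Q498H' into reference residue, position, and variant."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def call_substitutions(ref_aln: str, qry_aln: str) -> List[str]:
    """List substitutions between two aligned protein strings ('-' marks a gap).

    Positions are counted along the reference, so gap columns in the
    reference do not advance the coordinate. Gap columns in either
    sequence are skipped rather than reported.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    labels = []
    pos = 0
    for ref_aa, qry_aa in zip(ref_aln, qry_aln):
        if ref_aa != "-":
            pos += 1
        if ref_aa == "-" or qry_aa == "-":
            continue
        if ref_aa != qry_aa:
            labels.append(f"{ref_aa}{pos}{qry_aa}")
    return labels


# Labels taken from the summary above.
spike_change = parse_label("Q498H")
orf1ab_positions = sorted(parse_label(s).pos for s in ["L1457V", "I1525T", "D1585Y"])

# Toy alignment (assumed data for illustration only).
toy_calls = call_substitutions("MKAL-T", "MRALGT")
```

Counting positions along the reference keeps the labels comparable with coordinates reported against NC_045512.2-derived proteins, which is the reason gap columns in the reference are not counted.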