An application of slow feature analysis to the genetic sequences of coronaviruses and influenza viruses

Abstract. Background: Mathematical approaches have been used for decades to probe the structure of DNA sequences. This has led to the development of bioinformatics. In this exploratory work, a novel mathematical method is applied to probe the DNA structure of two related viral families: those of coron...

Bibliographic Details
Main Authors: Anastasios A. Tsonis, Geli Wang, Lvyi Zhang, Wenxu Lu, Aristotle Kayafas, Katia Del Rio-Tsonis
Format: Article
Language: English
Published: BMC 2021-05-01
Series: Human Genomics
Subjects: DNA complexity; Slow feature analysis; Coronaviruses; Influenza viruses
Online Access: https://doi.org/10.1186/s40246-021-00327-2
author Anastasios A. Tsonis
Geli Wang
Lvyi Zhang
Wenxu Lu
Aristotle Kayafas
Katia Del Rio-Tsonis
collection DOAJ
description Abstract. Background: Mathematical approaches have been used for decades to probe the structure of DNA sequences. This has led to the development of bioinformatics. In this exploratory work, a novel mathematical method is applied to probe the DNA structure of two related viral families: those of coronaviruses and those of influenza viruses. The coronaviruses are SARS-CoV-2, SARS-CoV-1, and MERS. The influenza viruses include H1N1-1918, H1N1-2009, H2N2-1957, and H3N2-1968. Methods: The mathematical method used is slow feature analysis (SFA), a rather new but promising method to delineate complex structure in DNA sequences. Results: The analysis indicates that the DNA sequences exhibit an elaborate and convoluted structure akin to complex networks. We define a measure of complexity and show that each DNA sequence exhibits a certain degree of complexity within itself, while at the same time there exist complex inter-relationships between the sequences within a family and between the two families. From these relationships, we find evidence, especially for the coronavirus family, that increasing complexity in a sequence is associated with a higher transmission rate but lower mortality. Conclusions: The complexity measure defined here may hold promise and could become a useful tool in the prediction of transmission and mortality rates for new viral strains.
first_indexed 2024-04-14T03:46:31Z
format Article
id doaj.art-f6a8b45feec943529807dc61bc0421ea
institution Directory Open Access Journal
issn 1479-7364
language English
last_indexed 2024-04-14T03:46:31Z
publishDate 2021-05-01
publisher BMC
record_format Article
series Human Genomics
spelling doaj.art-f6a8b45feec943529807dc61bc0421ea
2022-12-22T02:14:14Z
eng
BMC
Human Genomics
1479-7364
2021-05-01
Volume 15, Issue 1, Pages 1-10
10.1186/s40246-021-00327-2
An application of slow feature analysis to the genetic sequences of coronaviruses and influenza viruses
Anastasios A. Tsonis (Department of Mathematical Sciences, Atmospheric Sciences Group, University of Wisconsin-Milwaukee)
Geli Wang (Key Laboratory of Middle Atmosphere and Global Environment Observation (LAGEO), Institute of Atmospheric Physics, Chinese Academy of Sciences)
Lvyi Zhang (Key Laboratory of Middle Atmosphere and Global Environment Observation (LAGEO), Institute of Atmospheric Physics, Chinese Academy of Sciences)
Wenxu Lu (Key Laboratory of Middle Atmosphere and Global Environment Observation (LAGEO), Institute of Atmospheric Physics, Chinese Academy of Sciences)
Aristotle Kayafas (Department of Biology and Center for Visual Sciences, Miami University)
Katia Del Rio-Tsonis (Department of Biology and Center for Visual Sciences, Miami University)
https://doi.org/10.1186/s40246-021-00327-2
DNA complexity; Slow feature analysis; Coronaviruses; Influenza viruses
title An application of slow feature analysis to the genetic sequences of coronaviruses and influenza viruses
topic DNA complexity
Slow feature analysis
Coronaviruses
Influenza viruses
url https://doi.org/10.1186/s40246-021-00327-2