SARS-CoV-2 virus classification based on stacked sparse autoencoder

Since December 2019, the world has been severely affected by the COVID-19 pandemic, caused by SARS-CoV-2. When a novel virus is identified, early elucidation of the taxonomic classification and origin of its genomic sequence is essential for strategic planning, containment, and...

Bibliographic Details
Main Authors:	Maria G.F. Coutinho, Gabriel B.M. Câmara, Raquel de M. Barbosa, Marcelo A.C. Fernandes
Format:	Article
Language:	English
Published:	Elsevier 2023-01-01
Series:	Computational and Structural Biotechnology Journal
Subjects:	COVID-19 Deep learning SARS-CoV-2 Sparse autoencoder Viral classification
Online Access:	http://www.sciencedirect.com/science/article/pii/S2001037022005633

_version_	1827577697326333952
author	Maria G.F. Coutinho Gabriel B.M. Câmara Raquel de M. Barbosa Marcelo A.C. Fernandes
author_facet	Maria G.F. Coutinho Gabriel B.M. Câmara Raquel de M. Barbosa Marcelo A.C. Fernandes
author_sort	Maria G.F. Coutinho
collection	DOAJ
description	Since December 2019, the world has been severely affected by the COVID-19 pandemic, caused by SARS-CoV-2. When a novel virus is identified, early elucidation of the taxonomic classification and origin of its genomic sequence is essential for strategic planning, containment, and treatment. Deep learning techniques have been used successfully in many viral classification problems associated with viral infection diagnosis, metagenomics, phylogenetics, and analysis. With that motivation, we propose an efficient viral genome classifier for SARS-CoV-2 using a deep neural network based on the stacked sparse autoencoder (SSAE). To get the best performance from the model, we explored the use of image representations of complete genome sequences as the SSAE input. For that, a dataset based on k-mers image representation was used. We performed four experiments to provide different levels of taxonomic classification of SARS-CoV-2. The SSAE achieved classification accuracy between 92% and 100% on the validation set and between 98.9% and 100% when the SARS-CoV-2 samples were used as the test set. SARS-CoV-2 samples were not used during training, only during subsequent tests, in which the model inferred the correct classification in the vast majority of cases. This indicates that the model can be adapted to classify other emerging viruses. Finally, the results indicate the applicability of this deep learning technique to genome classification problems.
first_indexed	2024-03-08T21:31:35Z
format	Article
id	doaj.art-c087946b2c9c4d91a698022e3e3c53ad
institution	Directory of Open Access Journals
issn	2001-0370
language	English
last_indexed	2024-03-08T21:31:35Z
publishDate	2023-01-01
publisher	Elsevier
record_format	Article
series	Computational and Structural Biotechnology Journal
spelling	doaj.art-c087946b2c9c4d91a698022e3e3c53ad | 2023-12-21T07:30:23Z | eng | Elsevier | Computational and Structural Biotechnology Journal | 2001-0370 | 2023-01-01 | volume 21, pages 284-298 | SARS-CoV-2 virus classification based on stacked sparse autoencoder | Maria G.F. Coutinho (Laboratory of Machine Learning and Intelligent Instrumentation, IMD/nPITI, Federal University of Rio Grande do Norte, Natal, Brazil) | Gabriel B.M. Câmara (Laboratory of Machine Learning and Intelligent Instrumentation, IMD/nPITI, Federal University of Rio Grande do Norte, Natal, Brazil) | Raquel de M. Barbosa (Department of Pharmacy and Pharmaceutical Technology, University of Granada, 18071 Granada, Spain) | Marcelo A.C. Fernandes, corresponding author (Laboratory of Machine Learning and Intelligent Instrumentation, IMD/nPITI, Federal University of Rio Grande do Norte, Natal, Brazil; Department of Computer and Automation Engineering, Federal University of Rio Grande do Norte, Natal, Brazil) | http://www.sciencedirect.com/science/article/pii/S2001037022005633 | COVID-19; Deep learning; SARS-CoV-2; Sparse autoencoder; Viral classification
spellingShingle	Maria G.F. Coutinho Gabriel B.M. Câmara Raquel de M. Barbosa Marcelo A.C. Fernandes SARS-CoV-2 virus classification based on stacked sparse autoencoder Computational and Structural Biotechnology Journal COVID-19 Deep learning SARS-CoV-2 Sparse autoencoder Viral classification
title	SARS-CoV-2 virus classification based on stacked sparse autoencoder
title_full	SARS-CoV-2 virus classification based on stacked sparse autoencoder
title_fullStr	SARS-CoV-2 virus classification based on stacked sparse autoencoder
title_full_unstemmed	SARS-CoV-2 virus classification based on stacked sparse autoencoder
title_short	SARS-CoV-2 virus classification based on stacked sparse autoencoder
title_sort	sars cov 2 virus classification based on stacked sparse autoencoder
topic	COVID-19 Deep learning SARS-CoV-2 Sparse autoencoder Viral classification
url	http://www.sciencedirect.com/science/article/pii/S2001037022005633
work_keys_str_mv	AT mariagfcoutinho sarscov2virusclassificationbasedonstackedsparseautoencoder AT gabrielbmcamara sarscov2virusclassificationbasedonstackedsparseautoencoder AT raqueldembarbosa sarscov2virusclassificationbasedonstackedsparseautoencoder AT marceloacfernandes sarscov2virusclassificationbasedonstackedsparseautoencoder
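
The abstract says that complete genome sequences were turned into k-mers image representations before being fed to the SSAE, but this record does not give the details. The following is only a minimal sketch of one common way to build such an image, not the authors' pipeline: the function names `kmer_counts` and `kmer_image`, the alphabetical row-major layout, and the scaling by the largest count are all assumptions made for illustration.

```python
from itertools import product

ALPHABET = "ACGT"


def kmer_counts(sequence, k):
    """Count every overlapping k-mer that consists only of A, C, G, T."""
    counts = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # windows with N or other ambiguity codes are skipped
            counts[kmer] += 1
    return counts


def kmer_image(sequence, k):
    """Arrange scaled k-mer counts in a square grid (a list of rows).

    There are 4**k possible k-mers, which equals (2**k)**2, so they always
    fill a 2**k by 2**k grid. The layout is an assumption: k-mers are sorted
    alphabetically and written row by row.
    """
    counts = kmer_counts(sequence, k)
    side = 2 ** k
    peak = max(counts.values()) or 1  # avoid division by zero for short input
    kmers = sorted(counts)
    return [
        [counts[kmers[row * side + col]] / peak for col in range(side)]
        for row in range(side)
    ]


if __name__ == "__main__":
    # Toy sequence for illustration only; a real genome is far longer.
    for row in kmer_image("ACGTACGT", 2):
        print(row)
```

Each grid cell then acts as one pixel intensity in the range 0 to 1, and the flattened grid could serve as the input vector of an autoencoder. Larger values of k give larger and sparser images.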

Similar Items