Origin of novel coronavirus causing COVID-19: A computational biology study using artificial intelligence


Bibliographic Details
Main Authors: Thanh Thi Nguyen, Mohamed Abdelrazek, Dung Tien Nguyen, Sunil Aryal, Duc Thanh Nguyen, Sandeep Reddy, Quoc Viet Hung Nguyen, Amin Khatami, Thanh Tam Nguyen, Edbert B. Hsu, Samuel Yang
Format: Article
Language: English
Published: Elsevier 2022-09-01
Series: Machine Learning with Applications
Online Access: http://www.sciencedirect.com/science/article/pii/S266682702200041X
Description
Summary: The origin of the COVID-19 virus (SARS-CoV-2) has been intensely debated in the scientific community since the first infected cases were detected in December 2019. The disease has caused a global pandemic, leading to the deaths of thousands of people across the world, so finding the origin of this novel coronavirus is important in responding to and controlling the pandemic. Recent research results suggest, based on comparative studies of its genomic sequences, that bats or pangolins might be the hosts for SARS-CoV-2. This paper investigates the origin of SARS-CoV-2 using artificial intelligence (AI)-based unsupervised learning algorithms and raw genomic sequences of the virus. More than 300 genome sequences from COVID-19 cases collected in different countries are explored and analysed using unsupervised clustering methods. The results obtained from various AI-enabled experiments using clustering algorithms demonstrate that all examined SARS-CoV-2 genomes belong to a cluster that also contains bat and pangolin coronavirus genomes. This provides evidence strongly supporting the scientific hypotheses that bats and pangolins are probable hosts for SARS-CoV-2. At the whole-genome level, our findings also indicate that bats are more likely than pangolins to be the hosts for the COVID-19 virus.
ISSN: 2666-8270
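
Illustration: The summary describes grouping raw genome sequences with unsupervised clustering. The sketch below shows one generic way such a grouping could work. It is not the authors' pipeline: the choice of k-mer frequency profiles, Euclidean distance, and single-linkage agglomerative clustering is an assumption made for illustration, as are the function names and the short made-up toy sequences, which are not real viral genomes.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq, k=2):
    """Return the normalized frequency of every ACGT k-mer in seq."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # k-mers containing other symbols (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers) or 1
    return [counts[m] / total for m in kmers]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def single_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [{name} for name in profiles]

    def dist(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy sequences (invented for this sketch, not real genomes).
toy_sequences = {
    "toy_at_1": "ATATATTAATATATTATAATATAT",
    "toy_at_2": "ATATATTAATATTTTATAATATAT",
    "toy_at_3": "ATATAATAATATATTATAATATTT",
    "toy_gc_1": "GCGCGGCCGCGCGGCGCCGCGCGC",
    "toy_gc_2": "GCGCGGCCGCGGGGCGCCGCGCGC",
}

profiles = {name: kmer_profile(seq) for name, seq in toy_sequences.items()}
clusters = single_linkage(profiles, n_clusters=2)
for members in sorted(sorted(c) for c in clusters):
    print(members)
```

In a setting like the one the summary describes, a query genome that lands in the same cluster as reference genomes from a particular host would be read as compositionally similar to them. Real genomes are tens of thousands of bases long, so a practical analysis would need a larger k, a more careful distance measure, and a library implementation of the clustering step.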