Using machine learning to detect coronaviruses potentially infectious to humans
Abstract Establishing the host range for novel viruses remains a challenge. Here, we address the challenge of identifying non-human animal coronaviruses that may infect humans by creating an artificial neural network model that learns from spike protein sequences of alpha and beta coronaviruses and...
Main Authors: | Georgina Gonzalez-Isunza, M. Zaki Jawaid, Pengyu Liu, Daniel L. Cox, Mariel Vazquez, Javier Arsuaga |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2023-06-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-023-35861-7 |
_version_ | 1797806732088442880 |
---|---|
author | Georgina Gonzalez-Isunza M. Zaki Jawaid Pengyu Liu Daniel L. Cox Mariel Vazquez Javier Arsuaga |
collection | DOAJ |
description | Abstract Establishing the host range for novel viruses remains a challenge. Here, we address the challenge of identifying non-human animal coronaviruses that may infect humans by creating an artificial neural network model that learns from spike protein sequences of alpha and beta coronaviruses and their binding annotation to their host receptor. The proposed method produces a human-Binding Potential (h-BiP) score that distinguishes, with high accuracy, the binding potential among coronaviruses. Three viruses, previously unknown to bind human receptors, were identified: Bat coronavirus BtCoV/133/2005 and Pipistrellus abramus bat coronavirus HKU5-related (both MERS-related viruses), and Rhinolophus affinis coronavirus isolate LYRa3 (a SARS-related virus). We further analyze the binding properties of BtCoV/133/2005 and LYRa3 using molecular dynamics. To test whether this model can be used for surveillance of novel coronaviruses, we re-trained the model on a set that excludes SARS-CoV-2 and all viral sequences released after SARS-CoV-2 was published. The results predict the binding of SARS-CoV-2 with a human receptor, indicating that machine learning methods are an excellent tool for the prediction of host expansion events. |
first_indexed | 2024-03-13T06:11:45Z |
format | Article |
id | doaj.art-b2c2e98ae64347ba969d848fbde6139e |
institution | Directory Open Access Journal |
issn | 2045-2322 |
language | English |
last_indexed | 2024-03-13T06:11:45Z |
publishDate | 2023-06-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-b2c2e98ae64347ba969d848fbde6139e; 2023-06-11T11:14:20Z; eng; Nature Portfolio; Scientific Reports; ISSN 2045-2322; 2023-06-01; vol. 13, no. 1, pp. 1-12; 10.1038/s41598-023-35861-7; Using machine learning to detect coronaviruses potentially infectious to humans; Georgina Gonzalez-Isunza (Department of Microbiology and Molecular Genetics, University of California); M. Zaki Jawaid (Department of Physics, University of California); Pengyu Liu (Department of Microbiology and Molecular Genetics, University of California); Daniel L. Cox (Department of Physics, University of California); Mariel Vazquez (Department of Microbiology and Molecular Genetics, University of California); Javier Arsuaga (Department of Molecular and Cellular Biology, University of California); https://doi.org/10.1038/s41598-023-35861-7 |
title | Using machine learning to detect coronaviruses potentially infectious to humans |
url | https://doi.org/10.1038/s41598-023-35861-7 |