Identifying novel host-based diagnostic biomarker panels for COVID-19: a whole-blood/nasopharyngeal transcriptome meta-analysis

Abstract. Background: Despite progress in controlling the COVID-19 pandemic, the lack of comprehensive insight into SARS-CoV-2 pathogenesis remains a major challenge. To address this challenge, we used advanced bioinformatics and machine learning algorithms to reveal...

Bibliographic Details
Main Authors: Samaneh Maleknia, Mohammad Javad Tavassolifar, Faezeh Mottaghitalab, Mohammad Reza Zali, Anna Meyfour
Format: Article
Language: English
Published: BMC 2022-08-01
Series: Molecular Medicine
Subjects: COVID-19; Biomarker; Data integration; Systems biology; Whole blood; Nasopharyngeal swab
Online Access: https://doi.org/10.1186/s10020-022-00513-5
collection DOAJ
description Abstract
Background: Despite progress in controlling the COVID-19 pandemic, the lack of comprehensive insight into SARS-CoV-2 pathogenesis remains a major challenge. To address this challenge, we used advanced bioinformatics and machine learning algorithms to reveal more characteristics of SARS-CoV-2 pathogenesis and to introduce novel host response-based diagnostic biomarker panels.
Methods: In the present study, eight published RNA-Seq datasets of whole-blood (WB) and nasopharyngeal (NP) swab samples from patients with COVID-19, patients with other viral and non-viral acute respiratory illnesses (ARIs), and healthy controls (HCs) were integrated. To define COVID-19 meta-signatures, Gene Ontology and pathway enrichment analyses were applied to compare COVID-19 with other similar diseases. Additionally, CIBERSORTx was run on the WB samples to characterize the immune cell landscape. Furthermore, the optimum WB- and NP-based diagnostic biomarkers were identified by evaluating all combinations of 3 to 9 selected features with a two-phase machine learning (ML) method that implemented k-fold cross-validation and independent test set validation.
Results: The host gene meta-signatures obtained for SARS-CoV-2 infection differed between the WB and NP samples. The Gene Ontology and enrichment results of the WB dataset indicated an enhancement of the inflammatory host response, cell cycle, and interferon signature in COVID-19 patients. Furthermore, NP samples of COVID-19 patients, compared with HCs and non-viral ARIs, showed significant upregulation of genes associated with cytokine production and defense response to the virus. In contrast, these pathways were strikingly attenuated in COVID-19 compared with other viral ARIs. Notably, immune cell proportions in WB samples were altered in COVID-19 versus HCs. Moreover, after two phases of ML-based validation, the optimum WB- and NP-based diagnostic panels included 6 and 8 markers, with accuracies of 97% and 88%, respectively.
Conclusions: Based on the distinct gene expression profiles of WB and NP, our results indicated that SARS-CoV-2 function is body-site-specific, although the common signature in WB and NP COVID-19 samples versus controls shows that this virus also induces a global and systemic host response to some extent. We also introduced and validated WB- and NP-based diagnostic biomarkers using ML methods, which can be applied as a complementary tool to distinguish COVID-19 infection from non-COVID cases.
first_indexed 2024-04-13T11:25:07Z
id doaj.art-4d651ba6cd0043ccb7b77ca407d4d304
institution Directory Open Access Journal
issn 1076-1551
1528-3658
last_indexed 2024-04-13T11:25:07Z
spelling Molecular Medicine, vol. 28, no. 1, pp. 1-20, 2022-08-01, doi:10.1186/s10020-022-00513-5
affiliation Samaneh Maleknia, Mohammad Javad Tavassolifar, Faezeh Mottaghitalab, Anna Meyfour: Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences
Mohammad Reza Zali: Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences
topic COVID-19
Biomarker
Data integration
Systems biology
Whole blood
Nasopharyngeal swab