Identification of Novel Candidate Epitopes on SARS-CoV-2 Proteins for South America: A Review of HLA Frequencies by Country

Bibliographic Details
Main Authors: David Requena, Aldhair Médico, Ruy D. Chacón, Manuel Ramírez, Obert Marín-Sánchez
Format: Article
Language: English
Published: Frontiers Media S.A. 2020-09-01
Series: Frontiers in Immunology
Subjects: allele frequency, HLA, literature review, South America, epitope, immunoinformatics
Online Access: https://www.frontiersin.org/article/10.3389/fimmu.2020.02008/full
author David Requena
Aldhair Médico
Ruy D. Chacón
Manuel Ramírez
Obert Marín-Sánchez
author_sort David Requena
collection DOAJ
description Coronavirus disease (COVID-19), caused by the virus SARS-CoV-2, is already responsible for more than 4.3 million confirmed cases and 295,000 deaths worldwide as of May 15, 2020. Ongoing efforts to control the pandemic include the development of peptide-based vaccines and diagnostic tests. In these approaches, HLA allelic diversity plays a crucial role. Despite its importance, current knowledge of HLA allele frequencies in South America is very limited. In this study, we performed a literature review of datasets reporting HLA frequencies of South American populations, available in the scientific literature and/or in the Allele Frequency Net Database. This allowed us to enrich the current scenario with more than 12.8 million data points. As a result, we present updated HLA allelic frequencies by country, including 91 alleles that were previously thought to have frequencies either under 5% or of unknown value. Using alleles with an updated frequency of at least 5% in any South American country, we predicted epitopes in SARS-CoV-2 proteins using NetMHCpan (I and II) and MHCflurry. Then, the best predicted epitopes (class-I and class-II) were selected based on their binding to South American alleles (Coverage Score). Class-II predicted epitopes were also filtered based on their three-dimensional exposure. We obtained 14 class-I and four class-II candidate epitopes with experimental evidence (reported in the Immune Epitope Database and Analysis Resource) that have good coverage scores for South America. Additionally, we present 13 HLA-I and 30 HLA-II novel candidate epitopes without experimental evidence, including 16 class-II candidates in highly exposed conserved areas of the NTD and RBD regions of the Spike protein. These novel candidates have even better coverage scores for South America than those with experimental evidence. Finally, we show that recent similar studies presenting candidate epitopes also predicted some of our candidates but discarded them in the selection process, resulting in candidates with suboptimal coverage for South America. In conclusion, the candidate epitopes presented provide valuable information for the development of epitope-based strategies against SARS-CoV-2, such as peptide vaccines and diagnostic tests. Additionally, the updated HLA allelic frequencies provide a better representation of South America and may impact different immunogenetic studies.
first_indexed 2024-12-14T08:06:12Z
format Article
id doaj.art-8adbcb5795f34a3db7a288186b25642e
institution Directory Open Access Journal
issn 1664-3224
language English
last_indexed 2024-12-14T08:06:12Z
publishDate 2020-09-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Immunology
spelling doaj.art-8adbcb5795f34a3db7a288186b25642e
2022-12-21T23:10:10Z
eng
Frontiers Media S.A.
Frontiers in Immunology
1664-3224
2020-09-01
11
10.3389/fimmu.2020.02008
562671
Identification of Novel Candidate Epitopes on SARS-CoV-2 Proteins for South America: A Review of HLA Frequencies by Country
David Requena [0]
Aldhair Médico [1]
Ruy D. Chacón [2]
Manuel Ramírez [3]
Obert Marín-Sánchez [4]
[0] Laboratory of Cellular Biophysics, The Rockefeller University, New York, NY, United States
[1] Laboratorio de Bioinformática, Biología Molecular y Desarrollos Tecnológicos, Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, Peru
[2] Departamento de Patologia, Faculdade de Medicina Veterinária e Zootecnia, Programa Interunidades em Biotecnologia, Universidade de São Paulo, São Paulo, Brazil
[3] Unidad de Bioinformática, Centro de Investigaciones Tecnológicas, Biomédicas y Medioambientales, Lima, Peru
[4] Departamento Académico de Microbiología Médica, Facultad de Medicina, Universidad Nacional Mayor de San Marcos, Lima, Peru
https://www.frontiersin.org/article/10.3389/fimmu.2020.02008/full
allele frequency
HLA
literature review
South America
epitope
immunoinformatics
title Identification of Novel Candidate Epitopes on SARS-CoV-2 Proteins for South America: A Review of HLA Frequencies by Country
topic allele frequency
HLA
literature review
South America
epitope
immunoinformatics
url https://www.frontiersin.org/article/10.3389/fimmu.2020.02008/full