HLAProfiler utilizes k-mer profiles to improve HLA calling accuracy for rare and common alleles in RNA-seq data

Bibliographic Details
Main Authors: Martin L. Buchkovich, Chad C. Brown, Kimberly Robasky, Shengjie Chai, Sharon Westfall, Benjamin G. Vincent, Eric T. Weimer, Jason G. Powers
Format: Article
Language: English
Published: BMC 2017-09-01
Series: Genome Medicine
Subjects: HLA; HSCT; Transplantation; Immunology; RNA-sequencing
Online Access: http://link.springer.com/article/10.1186/s13073-017-0473-6
author Martin L. Buchkovich
Chad C. Brown
Kimberly Robasky
Shengjie Chai
Sharon Westfall
Benjamin G. Vincent
Eric T. Weimer
Jason G. Powers
collection DOAJ
description Abstract
Background: The human leukocyte antigen (HLA) system is a genomic region involved in regulating the human immune system by encoding cell membrane major histocompatibility complex (MHC) proteins that are responsible for self-recognition. Understanding the variation in this region provides important insights into autoimmune disorders, disease susceptibility, oncological immunotherapy, regenerative medicine, transplant rejection, and toxicogenomics. Traditional approaches to HLA typing are low throughput, target only a few genes, are labor intensive and costly, or require specialized protocols. RNA sequencing promises a relatively inexpensive, high-throughput solution for HLA calling across all genes, with the bonus of complete transcriptome information and widespread availability of historical data. Existing tools have been limited in their ability to accurately and comprehensively call HLA genes from RNA-seq data.
Results: We created HLAProfiler (https://github.com/ExpressionAnalysis/HLAProfiler), a k-mer profile-based method for HLA calling in RNA-seq data which can identify rare and common HLA alleles with >99% accuracy at two-field precision in both biological and simulated data. For 68% of novel alleles not present in the reference database, HLAProfiler can correctly identify the two-field precision or exact coding sequence, a significant advance over existing algorithms.
Conclusions: HLAProfiler allows for accurate HLA calls in RNA-seq data, reliably expanding the utility of these data in HLA-related research and enabling advances across a broad range of disciplines. Additionally, by using the observed data to identify potential novel alleles and update partial alleles, HLAProfiler will facilitate further improvements to the existing database of reference HLA alleles. HLAProfiler is available at https://expressionanalysis.github.io/HLAProfiler/.
first_indexed 2024-12-19T13:34:35Z
format Article
id doaj.art-e991dd965ccc46c98ddaccc12564b51c
institution Directory Open Access Journal
issn 1756-994X
language English
last_indexed 2024-12-19T13:34:35Z
publishDate 2017-09-01
publisher BMC
record_format Article
series Genome Medicine
spelling doaj.art-e991dd965ccc46c98ddaccc12564b51c 2022-12-21T20:19:16Z eng BMC Genome Medicine 1756-994X 2017-09-01 91115 10.1186/s13073-017-0473-6
Martin L. Buchkovich: Translational Genomics Department, Q2 Solutions | EA Genomics, a Quintiles Quest Joint Venture
Chad C. Brown: Translational Genomics Department, Q2 Solutions | EA Genomics, a Quintiles Quest Joint Venture
Kimberly Robasky: Translational Genomics Department, Q2 Solutions | EA Genomics, a Quintiles Quest Joint Venture
Shengjie Chai: Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill
Sharon Westfall: Translational Genomics Department, Q2 Solutions | EA Genomics, a Quintiles Quest Joint Venture
Benjamin G. Vincent: Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill
Eric T. Weimer: Department of Pathology and Laboratory Medicine, University of North Carolina at Chapel Hill
Jason G. Powers: Translational Genomics Department, Q2 Solutions | EA Genomics, a Quintiles Quest Joint Venture
title HLAProfiler utilizes k-mer profiles to improve HLA calling accuracy for rare and common alleles in RNA-seq data
topic HLA
HSCT
Transplantation
Immunology
RNA-sequencing
url http://link.springer.com/article/10.1186/s13073-017-0473-6