HLAProfiler utilizes k-mer profiles to improve HLA calling accuracy for rare and common alleles in RNA-seq data
Abstract Background The human leukocyte antigen (HLA) system is a genomic region involved in regulating the human immune system by encoding cell membrane major histocompatibility complex (MHC) proteins that are responsible for self-recognition. Understanding the variation in this region provides imp...
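Main Authors: | Martin L. Buchkovich, Chad C. Brown, Kimberly Robasky, Shengjie Chai, Sharon Westfall, Benjamin G. Vincent, Eric T. Weimer, Jason G. Powers |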
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2017-09-01 |
Series: | Genome Medicine |
Subjects: | HLA, HSCT, Transplantation, Immunology, RNA-sequencing |
Online Access: | http://link.springer.com/article/10.1186/s13073-017-0473-6 |
_version_ | 1818875947627053056 |
---|---|
author | Martin L. Buchkovich Chad C. Brown Kimberly Robasky Shengjie Chai Sharon Westfall Benjamin G. Vincent Eric T. Weimer Jason G. Powers |
author_sort | Martin L. Buchkovich |
collection | DOAJ |
description | Background: The human leukocyte antigen (HLA) system is a genomic region involved in regulating the human immune system by encoding cell membrane major histocompatibility complex (MHC) proteins that are responsible for self-recognition. Understanding the variation in this region provides important insights into autoimmune disorders, disease susceptibility, oncological immunotherapy, regenerative medicine, transplant rejection, and toxicogenomics. Traditional approaches to HLA typing are low throughput, target only a few genes, are labor intensive and costly, or require specialized protocols. RNA sequencing promises a relatively inexpensive, high-throughput solution for HLA calling across all genes, with the bonus of complete transcriptome information and widespread availability of historical data. Existing tools have been limited in their ability to accurately and comprehensively call HLA genes from RNA-seq data. Results: We created HLAProfiler (https://github.com/ExpressionAnalysis/HLAProfiler), a k-mer profile-based method for HLA calling in RNA-seq data which can identify rare and common HLA alleles with >99% accuracy at two-field precision in both biological and simulated data. For 68% of novel alleles not present in the reference database, HLAProfiler can correctly identify the two-field precision or exact coding sequence, a significant advance over existing algorithms. Conclusions: HLAProfiler allows for accurate HLA calls in RNA-seq data, reliably expanding the utility of these data in HLA-related research and enabling advances across a broad range of disciplines. Additionally, by using the observed data to identify potential novel alleles and update partial alleles, HLAProfiler will facilitate further improvements to the existing database of reference HLA alleles. HLAProfiler is available at https://expressionanalysis.github.io/HLAProfiler/. |
first_indexed | 2024-12-19T13:34:35Z |
format | Article |
id | doaj.art-e991dd965ccc46c98ddaccc12564b51c |
institution | Directory Open Access Journal |
issn | 1756-994X |
language | English |
last_indexed | 2024-12-19T13:34:35Z |
publishDate | 2017-09-01 |
publisher | BMC |
record_format | Article |
series | Genome Medicine |
spelling | doaj.art-e991dd965ccc46c98ddaccc12564b51c; 2022-12-21T20:19:16Z; eng; BMC; Genome Medicine; ISSN 1756-994X; 2017-09-01; vol. 9, no. 1, pp. 1-15; doi:10.1186/s13073-017-0473-6; HLAProfiler utilizes k-mer profiles to improve HLA calling accuracy for rare and common alleles in RNA-seq data; Martin L. Buchkovich [1], Chad C. Brown [1], Kimberly Robasky [1], Shengjie Chai [2], Sharon Westfall [1], Benjamin G. Vincent [2], Eric T. Weimer [3], Jason G. Powers [1]; [1] Translational Genomics Department, Q2 Solutions | EA Genomics, a Quintiles Quest Joint Venture; [2] Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill; [3] Department of Pathology and Laboratory Medicine, University of North Carolina at Chapel Hill; http://link.springer.com/article/10.1186/s13073-017-0473-6; HLA; HSCT; Transplantation; Immunology; RNA-sequencing |
title | HLAProfiler utilizes k-mer profiles to improve HLA calling accuracy for rare and common alleles in RNA-seq data |
topic | HLA HSCT Transplantation Immunology RNA-sequencing |
url | http://link.springer.com/article/10.1186/s13073-017-0473-6 |