Statistical resolution of ambiguous HLA typing data

Bibliographic Details
Main Authors: Listgarten, J; Brumme, Z; Kadie, C; Xiaojiang, G; Walker, B; Carrington, M; Goulder, P; Heckerman, D
Format: Journal article
Language: English
Published: Public Library of Science 2008
author Listgarten, J
Brumme, Z
Kadie, C
Xiaojiang, G
Walker, B
Carrington, M
Goulder, P
Heckerman, D
author_sort Listgarten, J
collection OXFORD
description High-resolution HLA typing plays a central role in many areas of immunology, such as in identifying immunogenetic risk factors for disease, in studying how the genomes of pathogens evolve in response to immune selection pressures, and also in vaccine design, where identification of HLA-restricted epitopes may be used to guide the selection of vaccine immunogens. Perhaps one of the most immediate applications is in direct medical decisions concerning the matching of stem cell transplant donors to unrelated recipients. However, high-resolution HLA typing is frequently unavailable due to its high cost or the inability to re-type historical data. In this paper, we introduce and evaluate a method for statistical, in silico refinement of ambiguous and/or low-resolution HLA data. Our method, which requires an independent, high-resolution training data set drawn from the same population as the data to be refined, uses linkage disequilibrium in HLA haplotypes as well as four-digit allele frequency data to probabilistically refine HLA typings. Central to our approach is the use of haplotype inference. We introduce new methodology to this area, improving upon the Expectation-Maximization (EM)-based approaches currently used within the HLA community. Our improvements are achieved by using a parsimonious parameterization for haplotype distributions and by smoothing the maximum likelihood (ML) solution. These improvements make it possible to scale the refinement to a larger number of alleles and loci in a more computationally efficient and stable manner. We also show how to augment our method in order to incorporate ethnicity information (as HLA allele distributions vary widely according to race/ethnicity as well as geographic area), and demonstrate the potential utility of this experimentally. A tool based on our approach is freely available for research purposes at http://microsoft.com/science.
first_indexed 2024-03-06T19:38:06Z
format Journal article
id oxford-uuid:1fb79477-8a28-4c85-8bfe-ead96063ed98
institution University of Oxford
language English
last_indexed 2024-03-06T19:38:06Z
publishDate 2008
publisher Public Library of Science
record_format dspace
title Statistical resolution of ambiguous HLA typing data