A hidden Markov model for haplotype inference for present-absent data of clustered genes using identified haplotypes and haplotype patterns

The majority of killer cell immunoglobulin-like receptor (KIR) genes are detected as either present or absent using locus-specific genotyping technology. Ambiguity arises from the presence of a specific KIR gene, since the exact copy number (one or two) of that gene is unknown. Therefore, haplotype inf...

Bibliographic Details
Main Authors: Kui Zhang, Jihua Wu, Guo-bo Chen, Degui Zhi, Nianjun Liu
Format: Article
Language: English
Published: Frontiers Media S.A. 2014-08-01
Series: Frontiers in Genetics
Subjects:
Online Access: http://journal.frontiersin.org/Journal/10.3389/fgene.2014.00267/full
collection DOAJ
description The majority of killer cell immunoglobulin-like receptor (KIR) genes are detected as either present or absent using locus-specific genotyping technology. Ambiguity arises from the presence of a specific KIR gene, since the exact copy number (one or two) of that gene is unknown. Haplotype inference for these genes is therefore more challenging because of this large portion of missing information. Meanwhile, many haplotypes and partial haplotype patterns have been previously identified owing to tight linkage disequilibrium (LD) among these clustered genes, and they can be incorporated to facilitate haplotype inference. In this paper, we developed a hidden Markov model (HMM) based method that can incorporate identified haplotypes or partial haplotype patterns for haplotype inference from present-absent data of clustered genes (e.g., KIR genes). Through extensive simulations for KIR genes, we compared its performance with a previously developed expectation maximization (EM) based method in terms of haplotype assignments and haplotype frequency estimation. The simulation results showed that the new HMM based method outperformed the previous method when some incorrect haplotypes were included as identified haplotypes and/or the standard deviation of haplotype frequencies was small. We also compared the performance of our method with two methods that do not use previously identified haplotypes and haplotype patterns: an EM based method, HAPLORE, and an HMM based method, MaCH. Our simulation results showed that incorporating identified haplotypes and partial haplotype patterns can improve the accuracy of haplotype inference. The new software package HaploHMM is available and can be downloaded at http://www.soph.uab.edu/ssg/files/People/KZhang/HaploHMM/haplohmm-index.html.
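The ambiguity described in the abstract can be illustrated with a small sketch. It is not the HaploHMM algorithm, and it does no HMM or EM fitting; it only enumerates which pairs of previously identified haplotypes are compatible with one present-absent genotype. The three-gene haplotypes and the function names `observed_pattern` and `compatible_pairs` are hypothetical, chosen for illustration only.

```python
from itertools import combinations_with_replacement


def observed_pattern(hap_a, hap_b):
    """Present-absent genotype produced by a pair of haplotypes.

    A gene is observed as present (1) if at least one of the two
    haplotypes carries it, so copy number 1 and 2 look identical.
    """
    return tuple(int(a or b) for a, b in zip(hap_a, hap_b))


def compatible_pairs(genotype, known_haplotypes):
    """Return all unordered pairs of known haplotypes that reproduce genotype."""
    genotype = tuple(genotype)
    return [
        (h1, h2)
        for h1, h2 in combinations_with_replacement(known_haplotypes, 2)
        if observed_pattern(h1, h2) == genotype
    ]


# Hypothetical identified haplotypes over three genes (1 = present, 0 = absent).
known = [(1, 0, 0), (1, 1, 0), (0, 1, 0), (1, 1, 1)]

# One individual typed as: gene 1 present, gene 2 present, gene 3 absent.
pairs = compatible_pairs((1, 1, 0), known)
for h1, h2 in pairs:
    print(h1, h2)
```

With these assumed haplotypes, a single genotype is explained by several different haplotype pairs, which is why a statistical model is needed to rank them by frequency.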
id doaj.art-a3ab72ab3a384e4ca2646267ac243b77
institution Directory Open Access Journal
issn 1664-8021
author_affiliation Kui Zhang (University of Alabama at Birmingham)
Jihua Wu (University of Alabama at Birmingham)
Guo-bo Chen (The University of Queensland, Queensland Brain Institute)
Degui Zhi (University of Alabama at Birmingham)
Nianjun Liu (University of Alabama at Birmingham)
volume 5
doi 10.3389/fgene.2014.00267
topic haplotype
Hidden Markov model
haplotype inference
KIR genes
Haplotype Patterns