vi-HMM: a novel HMM-based method for sequence variant identification in short-read data

Abstract. Background: Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in next-generation sequencing (NGS) applications. Existing methods for calling these variants often...

Bibliographic Details
Main Authors: Man Tang, Mohammad Shabbir Hasan, Hongxiao Zhu, Liqing Zhang, Xiaowei Wu
Format: Article
Language: English
Published: BMC 2019-02-01
Series: Human Genomics
Subjects: HMM, Variant calling, SNP, INDEL, Viterbi algorithm
Online Access: http://link.springer.com/article/10.1186/s40246-019-0194-6
_version_ 1828281442387361792
author Man Tang
Mohammad Shabbir Hasan
Hongxiao Zhu
Liqing Zhang
Xiaowei Wu
collection DOAJ
description Abstract. Background: Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in next-generation sequencing (NGS) applications. Existing methods for calling these variants often make simplified assumptions of positional independence and fail to leverage the dependence between genotypes at nearby loci that is caused by linkage disequilibrium (LD). Results and conclusion: We propose vi-HMM, a hidden Markov model (HMM)-based method for calling SNPs and INDELs in mapped short-read data. This method allows transitions between hidden states (defined as “SNP,” “Ins,” “Del,” and “Match”) of adjacent genomic bases and determines an optimal hidden state path by using the Viterbi algorithm. The inferred hidden state path provides a direct solution to the identification of SNPs and INDELs. Simulation studies show that, under various sequencing depths, vi-HMM outperforms commonly used variant calling methods in terms of sensitivity and F1 score. When applied to the real data, vi-HMM demonstrates higher accuracy in calling SNPs and INDELs.
first_indexed 2024-04-13T08:15:39Z
format Article
id doaj.art-67755499f4a04108a7c86276a2652f84
institution Directory Open Access Journal
issn 1479-7364
language English
last_indexed 2024-04-13T08:15:39Z
publishDate 2019-02-01
publisher BMC
record_format Article
series Human Genomics
spelling doaj.art-67755499f4a04108a7c86276a2652f84
2022-12-22T02:54:48Z
eng
BMC
Human Genomics
1479-7364
2019-02-01
13(1):1-12
10.1186/s40246-019-0194-6
vi-HMM: a novel HMM-based method for sequence variant identification in short-read data
Man Tang (Department of Statistics, Virginia Tech)
Mohammad Shabbir Hasan (Department of Computer Science, Virginia Tech)
Hongxiao Zhu (Department of Statistics, Virginia Tech)
Liqing Zhang (Department of Computer Science, Virginia Tech)
Xiaowei Wu (Department of Statistics, Virginia Tech)
http://link.springer.com/article/10.1186/s40246-019-0194-6
HMM
Variant calling
SNP
INDEL
Viterbi algorithm
title vi-HMM: a novel HMM-based method for sequence variant identification in short-read data
topic HMM
Variant calling
SNP
INDEL
Viterbi algorithm
url http://link.springer.com/article/10.1186/s40246-019-0194-6
work_keys_str_mv AT mantang vihmmanovelhmmbasedmethodforsequencevariantidentificationinshortreaddata
AT mohammadshabbirhasan vihmmanovelhmmbasedmethodforsequencevariantidentificationinshortreaddata
AT hongxiaozhu vihmmanovelhmmbasedmethodforsequencevariantidentificationinshortreaddata
AT liqingzhang vihmmanovelhmmbasedmethodforsequencevariantidentificationinshortreaddata
AT xiaoweiwu vihmmanovelhmmbasedmethodforsequencevariantidentificationinshortreaddata
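
The abstract describes decoding an optimal path over the hidden states "Match", "SNP", "Ins," and "Del" with the Viterbi algorithm. The following is a minimal Python sketch of that decoding step only. It is not the vi-HMM implementation: the observation symbols ("ref", "alt", "extra", "gap") and every probability in START, TRANS, and EMIT are hypothetical toy values chosen for illustration, whereas the published method works on mapped short reads with its own parameterization.

```python
import math

STATES = ["Match", "SNP", "Ins", "Del"]

# Hypothetical toy parameters for illustration only (not taken from vi-HMM).
START = {"Match": 0.97, "SNP": 0.01, "Ins": 0.01, "Del": 0.01}
TRANS = {
    "Match": {"Match": 0.97, "SNP": 0.01, "Ins": 0.01, "Del": 0.01},
    "SNP":   {"Match": 0.94, "SNP": 0.04, "Ins": 0.01, "Del": 0.01},
    "Ins":   {"Match": 0.68, "SNP": 0.01, "Ins": 0.30, "Del": 0.01},
    "Del":   {"Match": 0.68, "SNP": 0.01, "Ins": 0.01, "Del": 0.30},
}
# Hypothetical per-position summary symbols: "ref" (agrees with reference),
# "alt" (mismatch), "extra" (inserted base), "gap" (missing base).
EMIT = {
    "Match": {"ref": 0.997, "alt": 0.001, "extra": 0.001, "gap": 0.001},
    "SNP":   {"ref": 0.05, "alt": 0.93, "extra": 0.01, "gap": 0.01},
    "Ins":   {"ref": 0.05, "alt": 0.01, "extra": 0.93, "gap": 0.01},
    "Del":   {"ref": 0.05, "alt": 0.01, "extra": 0.01, "gap": 0.93},
}


def viterbi(observations):
    """Return the most probable hidden state path for the observations."""
    if not observations:
        return []
    # Log-probability of the best path ending in each state at position 0.
    score = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for landing in state s at this position.
            best_prev = max(
                STATES, key=lambda p: score[p] + math.log(TRANS[p][s])
            )
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][obs])
            )
            pointers[s] = best_prev
        backpointers.append(pointers)
        score = new_score
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi(["ref", "ref", "alt", "ref", "gap", "gap", "ref"]))
```

Working in log space avoids numerical underflow on long sequences. With these toy parameters, positions labelled anything other than "Match" in the returned path would be reported as candidate variants.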