vi-HMM: a novel HMM-based method for sequence variant identification in short-read data
Abstract. Background: Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in next-generation sequencing (NGS) applications. Existing methods for calling these variants often...
Main Authors: | Man Tang, Mohammad Shabbir Hasan, Hongxiao Zhu, Liqing Zhang, Xiaowei Wu |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2019-02-01 |
Series: | Human Genomics |
Subjects: | HMM, Variant calling, SNP, INDEL, Viterbi algorithm |
Online Access: | http://link.springer.com/article/10.1186/s40246-019-0194-6 |
_version_ | 1828281442387361792 |
---|---|
author | Man Tang, Mohammad Shabbir Hasan, Hongxiao Zhu, Liqing Zhang, Xiaowei Wu |
collection | DOAJ |
description | Abstract. Background: Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in next-generation sequencing (NGS) applications. Existing methods for calling these variants often make the simplifying assumption of positional independence and fail to leverage the dependence between genotypes at nearby loci that is caused by linkage disequilibrium (LD). Results and conclusion: We propose vi-HMM, a hidden Markov model (HMM)-based method for calling SNPs and INDELs in mapped short-read data. The method allows transitions between the hidden states (defined as “SNP,” “Ins,” “Del,” and “Match”) of adjacent genomic bases and determines an optimal hidden state path using the Viterbi algorithm. The inferred hidden state path directly identifies SNPs and INDELs. Simulation studies show that, under various sequencing depths, vi-HMM outperforms commonly used variant calling methods in terms of sensitivity and F1 score. When applied to real data, vi-HMM demonstrates higher accuracy in calling SNPs and INDELs. |
first_indexed | 2024-04-13T08:15:39Z |
format | Article |
id | doaj.art-67755499f4a04108a7c86276a2652f84 |
institution | Directory of Open Access Journals |
issn | 1479-7364 |
language | English |
last_indexed | 2024-04-13T08:15:39Z |
publishDate | 2019-02-01 |
publisher | BMC |
record_format | Article |
series | Human Genomics |
spelling | doaj.art-67755499f4a04108a7c86276a2652f84; 2022-12-22T02:54:48Z; eng; BMC; Human Genomics; 1479-7364; 2019-02-01; 131112; 10.1186/s40246-019-0194-6; vi-HMM: a novel HMM-based method for sequence variant identification in short-read data; Man Tang (Department of Statistics, Virginia Tech), Mohammad Shabbir Hasan (Department of Computer Science, Virginia Tech), Hongxiao Zhu (Department of Statistics, Virginia Tech), Liqing Zhang (Department of Computer Science, Virginia Tech), Xiaowei Wu (Department of Statistics, Virginia Tech); Abstract. Background: Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in next-generation sequencing (NGS) applications. Existing methods for calling these variants often make the simplifying assumption of positional independence and fail to leverage the dependence between genotypes at nearby loci that is caused by linkage disequilibrium (LD). Results and conclusion: We propose vi-HMM, a hidden Markov model (HMM)-based method for calling SNPs and INDELs in mapped short-read data. The method allows transitions between the hidden states (defined as “SNP,” “Ins,” “Del,” and “Match”) of adjacent genomic bases and determines an optimal hidden state path using the Viterbi algorithm. The inferred hidden state path directly identifies SNPs and INDELs. Simulation studies show that, under various sequencing depths, vi-HMM outperforms commonly used variant calling methods in terms of sensitivity and F1 score. When applied to real data, vi-HMM demonstrates higher accuracy in calling SNPs and INDELs.; http://link.springer.com/article/10.1186/s40246-019-0194-6; HMM, Variant calling, SNP, INDEL, Viterbi algorithm |
title | vi-HMM: a novel HMM-based method for sequence variant identification in short-read data |
topic | HMM, Variant calling, SNP, INDEL, Viterbi algorithm |
url | http://link.springer.com/article/10.1186/s40246-019-0194-6 |
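
Illustrative note: the description in this record says vi-HMM finds an optimal path through the hidden states "Match," "SNP," "Ins," and "Del" with the Viterbi algorithm. The Python sketch below shows generic Viterbi decoding over those four states and is not the authors' implementation. The observation symbols (`agree`, `mismatch`, `insertion`, `deletion`) and every probability are hypothetical placeholders chosen for illustration; the record does not describe the paper's emission or transition model.

```python
import math

# The four hidden states come from the record's description.
STATES = ("Match", "SNP", "Ins", "Del")

# Everything below is a made-up toy parameterisation (assumption):
# one coarse symbol per genomic position summarising the read pileup.
START_P = {"Match": 0.97, "SNP": 0.01, "Ins": 0.01, "Del": 0.01}
TRANS_P = {
    "Match": {"Match": 0.97, "SNP": 0.01, "Ins": 0.01, "Del": 0.01},
    "SNP":   {"Match": 0.90, "SNP": 0.04, "Ins": 0.03, "Del": 0.03},
    "Ins":   {"Match": 0.60, "SNP": 0.05, "Ins": 0.30, "Del": 0.05},
    "Del":   {"Match": 0.60, "SNP": 0.05, "Ins": 0.05, "Del": 0.30},
}
EMIT_P = {
    "Match": {"agree": 0.997, "mismatch": 0.001, "insertion": 0.001, "deletion": 0.001},
    "SNP":   {"agree": 0.01, "mismatch": 0.97, "insertion": 0.01, "deletion": 0.01},
    "Ins":   {"agree": 0.01, "mismatch": 0.01, "insertion": 0.97, "deletion": 0.01},
    "Del":   {"agree": 0.01, "mismatch": 0.01, "insertion": 0.01, "deletion": 0.97},
}


def viterbi(observations, start_p, trans_p, emit_p):
    """Return the most likely hidden state path and its log-probability."""
    if not observations:
        return [], 0.0
    # score[s]: best log-probability of any path that ends in state s.
    score = {
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor of s at this position.
            prev = max(STATES, key=lambda p: score[p] + math.log(trans_p[p][s]))
            new_score[s] = (
                score[prev] + math.log(trans_p[prev][s]) + math.log(emit_p[s][obs])
            )
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    last = max(STATES, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


OBSERVATIONS = ["agree", "agree", "mismatch", "agree", "deletion", "deletion", "agree"]
path, log_prob = viterbi(OBSERVATIONS, START_P, TRANS_P, EMIT_P)
print(list(zip(OBSERVATIONS, path)))
```

Working in log space avoids numerical underflow on long sequences. The transition table is what lets neighbouring positions influence each other, in contrast to the positional-independence assumption that the description attributes to existing callers.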