Comparison of error correction algorithms for Ion Torrent PGM data: application to hepatitis B virus

Abstract: Ion Torrent Personal Genome Machine (PGM) technology is a mid-length read, low-cost and high-speed next-generation sequencing platform with a relatively high insertion and deletion (indel) error rate. A full systematic assessment of the effectiveness of various error correction algorithms i...

Bibliographic Details
Main Authors: Liting Song, Wenxun Huang, Juan Kang, Yuan Huang, Hong Ren, Keyue Ding
Format: Article
Language: English
Published: Nature Portfolio 2017-08-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-017-08139-y
author Liting Song
Wenxun Huang
Juan Kang
Yuan Huang
Hong Ren
Keyue Ding
collection DOAJ
description Ion Torrent Personal Genome Machine (PGM) technology is a mid-length read, low-cost and high-speed next-generation sequencing platform with a relatively high insertion and deletion (indel) error rate. A full systematic assessment of the effectiveness of various error correction algorithms in PGM viral datasets (e.g., hepatitis B virus (HBV)) has not been performed. We examined 19 quality-trimmed PGM datasets for the HBV reverse transcriptase (RT) region and found a total error rate of 0.48% ± 0.12%. Deletion errors were clearly present at the ends of homopolymer runs. Tests using both real and simulated data showed that the algorithms differed in their abilities to detect and correct errors and that the error rate and sequencing depth significantly affected performance. Of the algorithms tested, Pollux performed better overall but tended to over-correct ‘genuine’ substitution variants, whereas Fiona was better at distinguishing these variants from sequencing errors. We found that the combined use of Pollux and Fiona gave the best results when error-correcting Ion Torrent PGM viral data.
first_indexed 2024-12-18T04:02:47Z
format Article
id doaj.art-3ea364f13c72441ca774d4bcc1642163
institution Directory of Open Access Journals
issn 2045-2322
language English
last_indexed 2024-12-18T04:02:47Z
publishDate 2017-08-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-3ea364f13c72441ca774d4bcc1642163; 2022-12-21T21:21:39Z; eng; Nature Portfolio; Scientific Reports; ISSN 2045-2322; 2017-08-01; vol. 7, no. 1, pp. 1-11; 10.1038/s41598-017-08139-y
Comparison of error correction algorithms for Ion Torrent PGM data: application to hepatitis B virus
Liting Song, Wenxun Huang, Juan Kang, Hong Ren, Keyue Ding: Key Laboratory of Molecular Biology for Infectious Diseases (Ministry of Education), Institute for Viral Hepatitis, Department of Infectious Diseases, The Second Affiliated Hospital, Chongqing Medical University
Yuan Huang: Center for Hepatobillary and Pancreatic Diseases, Beijing Tsinghua Changgung Hospital, Medical Center, Tsinghua University
https://doi.org/10.1038/s41598-017-08139-y
title Comparison of error correction algorithms for Ion Torrent PGM data: application to hepatitis B virus
url https://doi.org/10.1038/s41598-017-08139-y