Comparison of error correction algorithms for Ion Torrent PGM data: application to hepatitis B virus
Abstract: Ion Torrent Personal Genome Machine (PGM) technology is a mid-length read, low-cost and high-speed next-generation sequencing platform with a relatively high insertion and deletion (indel) error rate. A full systematic assessment of the effectiveness of various error correction algorithms i...
Main Authors: | Liting Song, Wenxun Huang, Juan Kang, Yuan Huang, Hong Ren, Keyue Ding |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio 2017-08-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-017-08139-y |
_version_ | 1818749375395921920 |
---|---|
author | Liting Song, Wenxun Huang, Juan Kang, Yuan Huang, Hong Ren, Keyue Ding |
author_sort | Liting Song |
collection | DOAJ |
description | Abstract: Ion Torrent Personal Genome Machine (PGM) technology is a mid-length read, low-cost and high-speed next-generation sequencing platform with a relatively high insertion and deletion (indel) error rate. A full systematic assessment of the effectiveness of various error correction algorithms in PGM viral datasets (e.g., hepatitis B virus (HBV)) has not been performed. We examined 19 quality-trimmed PGM datasets for the HBV reverse transcriptase (RT) region and found a total error rate of 0.48% ± 0.12%. Deletion errors were clearly present at the ends of homopolymer runs. Tests using both real and simulated data showed that the algorithms differed in their abilities to detect and correct errors and that the error rate and sequencing depth significantly affected the performance. Of the algorithms tested, Pollux showed a better overall performance but tended to over-correct ‘genuine’ substitution variants, whereas Fiona proved to be better at distinguishing these variants from sequencing errors. We found that the combined use of Pollux and Fiona gave the best results when error-correcting Ion Torrent PGM viral data. |
first_indexed | 2024-12-18T04:02:47Z |
format | Article |
id | doaj.art-3ea364f13c72441ca774d4bcc1642163 |
institution | Directory Open Access Journal |
issn | 2045-2322 |
language | English |
last_indexed | 2024-12-18T04:02:47Z |
publishDate | 2017-08-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-3ea364f13c72441ca774d4bcc1642163; 2022-12-21T21:21:39Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2017-08-01; 71111; 10.1038/s41598-017-08139-y; Comparison of error correction algorithms for Ion Torrent PGM data: application to hepatitis B virus; Liting Song (0), Wenxun Huang (1), Juan Kang (2), Yuan Huang (3), Hong Ren (4), Keyue Ding (5); Affiliations 0, 1, 2, 4, 5: Key Laboratory of Molecular Biology for Infectious Diseases (Ministry of Education), Institute for Viral Hepatitis, Department of Infectious Diseases, The Second Affiliated Hospital, Chongqing Medical University; Affiliation 3: Center for Hepatobiliary and Pancreatic Diseases, Beijing Tsinghua Changgung Hospital, Medical Center, Tsinghua University; https://doi.org/10.1038/s41598-017-08139-y |
title | Comparison of error correction algorithms for Ion Torrent PGM data: application to hepatitis B virus |
url | https://doi.org/10.1038/s41598-017-08139-y |