Illumina reads correction: evaluation and improvements

Abstract The paper focuses on the correction of Illumina WGS sequencing reads. We provide an extensive evaluation of the existing correctors. To this end, we measure an impact of the correction on variant calling (VC) as well as de novo assembly. It shows, that in selected cases read correction impr...

Full description

Bibliographic Details
Main Authors: Maciej Długosz, Sebastian Deorowicz
Format: Article
Language:English
Published: Nature Portfolio 2024-01-01
Series:Scientific Reports
Online Access:https://doi.org/10.1038/s41598-024-52386-9
Description
Summary:Abstract The paper focuses on the correction of Illumina WGS sequencing reads. We provide an extensive evaluation of the existing correctors. To this end, we measure an impact of the correction on variant calling (VC) as well as de novo assembly. It shows, that in selected cases read correction improves the VC results quality. We also examine the algorithms behaviour in a processing of Illumina NovaSeq reads, with different reads quality characteristics than in older sequencers. We show that most of the algorithms are ready to cope with such reads. Finally, we introduce a new version of RECKONER, our read corrector, by optimizing it and equipping with a new correction strategy. Currently, RECKONER allows to correct high-coverage human reads in less than 2.5 h, is able to cope with two types of reads errors: indels and substitutions, and utilizes a new, based on a two lengths of oligomers, correction verification technique.
ISSN:2045-2322