REAPR: a universal tool for genome assembly evaluation.

Bibliographic Details
Main Authors: Hunt, M; Kikuchi, T; Sanders, M; Newbold, C; Berriman, M; Otto, T
Format: Journal article
Language: English
Published: 2013
Description: Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/.
Institution: University of Oxford