Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs

Bibliographic Details
Main Authors: Junyang Liu, Fang Liu, Weihua Pan
Format: Article
Language: English
Published: MDPI AG 2023-10-01
Series: Genes
Subjects: genome assembly, gap free, single-copy gene, completeness
Online Access: https://www.mdpi.com/2073-4425/14/10/1926
author Junyang Liu
Fang Liu
Weihua Pan
collection DOAJ
description For a long time, the construction of complete reference genomes for complex eukaryotic genomes has been hindered by the limitations of sequencing technologies. Recently, Pacific Biosciences (PacBio) HiFi data and Oxford Nanopore Technologies (ONT) Ultra-Long data, leveraging their respective advantages in accuracy and length, have provided an opportunity to generate complete chromosome sequences. Nevertheless, for the majority of genomes, chromosome-level assemblies generated using existing methods still miss a high proportion of sequences because small contigs are lost during assembly and scaffolding. To address this shortcoming, we propose a novel method that identifies and fills the gaps in a chromosome-level assembly by recalling the sequences in the lost small contigs. Experimental results on both real and simulated datasets demonstrate that this method improves the completeness of the chromosome-level assembly.
first_indexed 2024-03-11T10:14:49Z
format Article
id doaj.art-7705d4610ec140ae915d5f7c99a23b5f
institution Directory of Open Access Journals
issn 2073-4425
language English
last_indexed 2024-03-11T10:14:49Z
publishDate 2023-10-01
publisher MDPI AG
record_format Article
series Genes
spelling doaj.art-7705d4610ec140ae915d5f7c99a23b5f
2023-11-16T10:29:48Z
eng
MDPI AG
Genes
2073-4425
2023-10-01
Volume 14, Issue 10, Article 1926
DOI: 10.3390/genes14101926
Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs
Junyang Liu (0)
Fang Liu (1)
Weihua Pan (2)
(0) Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450001, China
(1) Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450001, China
(2) Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences (ICR, CAAS), Shenzhen 518120, China
https://www.mdpi.com/2073-4425/14/10/1926
genome assembly
gap free
single-copy gene
completeness
title Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs
topic genome assembly
gap free
single-copy gene
completeness
url https://www.mdpi.com/2073-4425/14/10/1926