Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs
For a long time, the construction of complete reference genomes for complex eukaryotic genomes has been hindered by the limitations of sequencing technologies. Recently, the Pacific Biosciences (PacBio) HiFi data and Oxford Nanopore Technologies (ONT) Ultra-Long data, leveraging their respective adv...
Main Authors: | Junyang Liu, Fang Liu, Weihua Pan |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-10-01 |
Series: | Genes |
Subjects: | genome assembly, gap free, single-copy gene, completeness |
Online Access: | https://www.mdpi.com/2073-4425/14/10/1926 |
_version_ | 1797626762922819584 |
---|---|
author | Junyang Liu Fang Liu Weihua Pan |
author_facet | Junyang Liu Fang Liu Weihua Pan |
author_sort | Junyang Liu |
collection | DOAJ |
description | For a long time, the construction of complete reference genomes for complex eukaryotic genomes has been hindered by the limitations of sequencing technologies. Recently, the Pacific Biosciences (PacBio) HiFi data and Oxford Nanopore Technologies (ONT) Ultra-Long data, leveraging their respective advantages in accuracy and length, have provided an opportunity for generating complete chromosome sequences. Nevertheless, for the majority of genomes, the chromosome-level assemblies generated using existing methods still miss a high proportion of sequences due to losing small contigs in the step of assembly and scaffolding. To address this shortcoming, in this paper, we propose a novel method that is able to identify and fill the gaps in the chromosome-level assembly by recalling the sequences in the lost small contigs. Experimental results on both real and simulated datasets demonstrate that this method is able to improve the completeness of the chromosome-level assembly. |
first_indexed | 2024-03-11T10:14:49Z |
format | Article |
id | doaj.art-7705d4610ec140ae915d5f7c99a23b5f |
institution | Directory of Open Access Journals |
issn | 2073-4425 |
language | English |
last_indexed | 2024-03-11T10:14:49Z |
publishDate | 2023-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Genes |
spelling | doaj.art-7705d4610ec140ae915d5f7c99a23b5f; 2023-11-16T10:29:48Z; eng; MDPI AG; Genes; 2073-4425; 2023-10-01; 14; 10; 1926; 10.3390/genes14101926; Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs; Junyang Liu (0); Fang Liu (1); Weihua Pan (2); Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450001, China; Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou 450001, China; Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences (ICR, CAAS), Shenzhen 518120, China; For a long time, the construction of complete reference genomes for complex eukaryotic genomes has been hindered by the limitations of sequencing technologies. Recently, the Pacific Biosciences (PacBio) HiFi data and Oxford Nanopore Technologies (ONT) Ultra-Long data, leveraging their respective advantages in accuracy and length, have provided an opportunity for generating complete chromosome sequences. Nevertheless, for the majority of genomes, the chromosome-level assemblies generated using existing methods still miss a high proportion of sequences due to losing small contigs in the step of assembly and scaffolding. To address this shortcoming, in this paper, we propose a novel method that is able to identify and fill the gaps in the chromosome-level assembly by recalling the sequences in the lost small contigs. Experimental results on both real and simulated datasets demonstrate that this method is able to improve the completeness of the chromosome-level assembly.; https://www.mdpi.com/2073-4425/14/10/1926; genome assembly; gap free; single-copy gene; completeness |
spellingShingle | Junyang Liu Fang Liu Weihua Pan Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs Genes genome assembly gap free single-copy gene completeness |
title | Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs |
title_full | Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs |
title_fullStr | Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs |
title_full_unstemmed | Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs |
title_short | Improving the Completeness of Chromosome-Level Assembly by Recalling Sequences from Lost Contigs |
title_sort | improving the completeness of chromosome level assembly by recalling sequences from lost contigs |
topic | genome assembly gap free single-copy gene completeness |
url | https://www.mdpi.com/2073-4425/14/10/1926 |
work_keys_str_mv | AT junyangliu improvingthecompletenessofchromosomelevelassemblybyrecallingsequencesfromlostcontigs AT fangliu improvingthecompletenessofchromosomelevelassemblybyrecallingsequencesfromlostcontigs AT weihuapan improvingthecompletenessofchromosomelevelassemblybyrecallingsequencesfromlostcontigs |