High-quality faba bean reference transcripts generated using PacBio and Illumina RNA-seq data

Bibliographic Details
Main Authors: Na Zhao, Enqiang Zhou, Yamei Miao, Dong Xue, Yongqiang Wang, Kaihua Wang, Chunyan Gu, Mengnan Yao, Yao Zhou, Bo Li, Xuejun Wang, Libin Wei
Format: Article
Language: English
Published: Nature Portfolio 2024-04-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-024-03204-4
Description
Summary: The genome of faba bean was first published in 2023. To promote future molecular breeding studies, we improved the quality of the faba bean genome using high-density genetic maps and Illumina and PacBio RNA-seq datasets. Two high-density genetic maps were used to order and orient the faba bean scaffolds, culminating in an increased length (i.e., 14.28 Mbp) of chromosomes and a decrease in the number of scaffolds by 45. In gene model mining and optimisation, the PacBio and Illumina RNA-seq datasets from 37 samples allowed for the identification and correction of 121,606 transcripts, and the data facilitated the prediction of 15,640 alternative splicing events, 2,148 lncRNAs, and 1,752 fusion transcripts, thus allowing for a clearer understanding of the gene structures underlying the faba bean genome. Moreover, a total of 38,850 new genes with 56,188 transcripts were identified compared with the reference genome. Finally, the genetic data of the reference genome were integrated to form a comprehensive and complete faba bean transcriptome sequence of 103,267 transcripts derived from 54,753 uni-genes.
ISSN: 2052-4463