Optimized sequencing depth and de novo assembler for deeply reconstructing the transcriptome of the tea plant, an economically important plant species


Bibliographic Details
Main Authors: Fang-Dong Li, Wei Tong, En-Hua Xia, Chao-Ling Wei
Format: Article
Language: English
Published: BMC 2019-11-01
Series: BMC Bioinformatics
Subjects: Tea plant, Camellia sinensis, Transcriptome, de novo assembly, Sequencing depth
Online Access: http://link.springer.com/article/10.1186/s12859-019-3166-x
collection DOAJ
description Abstract
Background: Tea is one of the oldest and most popular non-alcoholic beverages in the world and has important economic, health and cultural value. It is commonly produced from the leaves of the tea plant (Camellia sinensis), which belongs to the genus Camellia in the family Theaceae. In the last decade, many studies have generated transcriptomes of tea plants at different developmental stages or under abiotic and/or biotic stresses to investigate the genetic basis of the secondary metabolites that determine tea quality. However, the results differ widely, particularly in the total number of reconstructed transcripts and the quality of the assembled transcriptomes. These differences largely result from limited knowledge of the optimal sequencing depth and assembler for transcriptome assembly in plant species with structurally complex genomes.
Results: We used different amounts of RNA-sequencing data, ranging from 4 to 84 Gb, to assemble the tea plant transcriptome with five well-known and representative transcript assemblers. Although the total number of assembled transcripts increased with the amount of sequencing data, the proportion of unassembled transcripts became saturated, as revealed by the plant BUSCO datasets. Among the five assemblers, the Bridger package showed the best performance in both assembly completeness and accuracy, as evaluated by the BUSCO datasets and genome alignment. In addition, Bridger and BinPacker had the shortest runtimes, followed by SOAPdenovo and Trans-ABySS.
Conclusions: The present study compares the performance of five representative transcript assemblers and investigates the key factors that affect the assembly quality of the tea plant transcriptome. It should help the tea research community obtain better sequencing and assembly of tea plant transcriptomes under conditions of interest and may thus help to answer major biological questions currently facing the tea industry.
id doaj.art-f1c776052708400ba4b3f5053b17d5f7
institution Directory Open Access Journal
issn 1471-2105
spelling doaj.art-f1c776052708400ba4b3f5053b17d5f7; 2022-12-22T00:33:11Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2019-11-01; volume 20, issue 1, pages 1-11; DOI 10.1186/s12859-019-3166-x
Author affiliations: Fang-Dong Li, Wei Tong, En-Hua Xia and Chao-Ling Wei are all at the State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University
title Optimized sequencing depth and de novo assembler for deeply reconstructing the transcriptome of the tea plant, an economically important plant species
topic Tea plant
Camellia sinensis
Transcriptome
de novo assembly
Sequencing depth