De novo transcriptome assembly of Sorghum bicolor variety Taejin

Sorghum (Sorghum bicolor), also known as great millet, is one of the most popular cultivated grass species in the world. Sorghum is frequently consumed as food for humans and animals as well as used for ethanol production. In this study, we conducted de novo transcriptome assembly for sorghum variet...

Bibliographic Details
Main Authors: Yeonhwa Jo, Sen Lian, Jin Kyong Cho, Hoseong Choi, Sang-Min Kim, Sun-Lim Kim, Bong Choon Lee, Won Kyong Cho
Format: Article
Language: English
Published: Elsevier 2016-06-01
Series: Genomics Data
Subjects: RNA-Seq, Sorghum bicolor, Transcriptome, Variety
Online Access: http://www.sciencedirect.com/science/article/pii/S2213596016300617
_version_ 1818298473537077248
author Yeonhwa Jo
Sen Lian
Jin Kyong Cho
Hoseong Choi
Sang-Min Kim
Sun-Lim Kim
Bong Choon Lee
Won Kyong Cho
author_facet Yeonhwa Jo
Sen Lian
Jin Kyong Cho
Hoseong Choi
Sang-Min Kim
Sun-Lim Kim
Bong Choon Lee
Won Kyong Cho
author_sort Yeonhwa Jo
collection DOAJ
description Sorghum (Sorghum bicolor), also known as great millet, is one of the most widely cultivated grass species in the world. Sorghum is frequently consumed as food by humans and animals and is also used for ethanol production. In this study, we conducted de novo transcriptome assembly for sorghum variety Taejin by next-generation sequencing, obtaining 8.748 GB of raw data. The raw data are available in the NCBI SRA database under accession number SRX1715644. Using the Trinity program, we identified 222,161 transcripts from sorghum variety Taejin. We further predicted coding regions within the assembled transcripts with the TransDecoder program, resulting in a total of 148,531 proteins. We carried out BLASTP against the Swiss-Prot protein sequence database to annotate the functions of the identified proteins. To our knowledge, this is the first transcriptome data set for a sorghum variety derived from Korea, and it can be usefully applied to the generation of genetic markers.
first_indexed 2024-12-13T04:35:53Z
format Article
id doaj.art-ae7e2bf43e75487190ac15ef6dc2e722
institution Directory of Open Access Journals
issn 2213-5960
language English
last_indexed 2024-12-13T04:35:53Z
publishDate 2016-06-01
publisher Elsevier
record_format Article
series Genomics Data
spelling doaj.art-ae7e2bf43e75487190ac15ef6dc2e722
2022-12-21T23:59:26Z
eng
Elsevier
Genomics Data
2213-5960
2016-06-01
8C
117-118
10.1016/j.gdata.2016.05.002
De novo transcriptome assembly of Sorghum bicolor variety Taejin
Yeonhwa Jo (Department of Agricultural Biotechnology, College of Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Republic of Korea)
Sen Lian (College of Crop Protection and Agronomy, Qingdao Agricultural University, Qingdao, Shandong 266109, China)
Jin Kyong Cho (The Taejin Genome Institute, Gadam-gil 61, Hoengseong, 25239, Republic of Korea)
Hoseong Choi (Department of Agricultural Biotechnology, College of Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Republic of Korea)
Sang-Min Kim (Crop Foundation Division, National Institute of Crop Science, RDA, Wanju, 55365, Republic of Korea)
Sun-Lim Kim (Crop Foundation Division, National Institute of Crop Science, RDA, Wanju, 55365, Republic of Korea)
Bong Choon Lee (Crop Foundation Division, National Institute of Crop Science, RDA, Wanju, 55365, Republic of Korea)
Won Kyong Cho (Department of Agricultural Biotechnology, College of Agriculture and Life Sciences, Seoul National University, Seoul 151-921, Republic of Korea)
http://www.sciencedirect.com/science/article/pii/S2213596016300617
RNA-Seq
Sorghum bicolor
Transcriptome
Variety
spellingShingle Yeonhwa Jo
Sen Lian
Jin Kyong Cho
Hoseong Choi
Sang-Min Kim
Sun-Lim Kim
Bong Choon Lee
Won Kyong Cho
De novo transcriptome assembly of Sorghum bicolor variety Taejin
Genomics Data
RNA-Seq
Sorghum bicolor
Transcriptome
Variety
title De novo transcriptome assembly of Sorghum bicolor variety Taejin
title_full De novo transcriptome assembly of Sorghum bicolor variety Taejin
title_fullStr De novo transcriptome assembly of Sorghum bicolor variety Taejin
title_full_unstemmed De novo transcriptome assembly of Sorghum bicolor variety Taejin
title_short De novo transcriptome assembly of Sorghum bicolor variety Taejin
title_sort de novo transcriptome assembly of sorghum bicolor variety taejin
topic RNA-Seq
Sorghum bicolor
Transcriptome
Variety
url http://www.sciencedirect.com/science/article/pii/S2213596016300617
work_keys_str_mv AT yeonhwajo denovotranscriptomeassemblyofsorghumbicolorvarietytaejin
AT senlian denovotranscriptomeassemblyofsorghumbicolorvarietytaejin
AT jinkyongcho denovotranscriptomeassemblyofsorghumbicolorvarietytaejin
AT hoseongchoi denovotranscriptomeassemblyofsorghumbicolorvarietytaejin
AT sangminkim denovotranscriptomeassemblyofsorghumbicolorvarietytaejin
AT sunlimkim denovotranscriptomeassemblyofsorghumbicolorvarietytaejin
AT bongchoonlee denovotranscriptomeassemblyofsorghumbicolorvarietytaejin
AT wonkyongcho denovotranscriptomeassemblyofsorghumbicolorvarietytaejin