Nanopore sequencing-based genome assembly and evolutionary genomics of circum-basmati rice

Bibliographic Details
Main Authors: Jae Young Choi, Zoe N. Lye, Simon C. Groen, Xiaoguang Dai, Priyesh Rughani, Sophie Zaaijer, Eoghan D. Harrington, Sissel Juul, Michael D. Purugganan
Format: Article
Language: English
Published: BMC 2020-02-01
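Series: Genome Biology
Subjects: Oryza sativa; Asian rice; Aromatic rice group; Domestication; Crop evolution; Nanopore sequencing
Online Access: https://doi.org/10.1186/s13059-020-1938-2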
collection DOAJ
description Abstract. Background: The circum-basmati group of cultivated Asian rice (Oryza sativa) contains many iconic varieties and is widespread in the Indian subcontinent. Despite its economic and cultural importance, a high-quality reference genome is currently lacking, and the group’s evolutionary history is not fully resolved. To address these gaps, we use long-read nanopore sequencing and assemble the genomes of two circum-basmati rice varieties. Results: We generate two high-quality, chromosome-level reference genomes that represent the 12 chromosomes of Oryza. The assemblies show a contig N50 of 6.32 Mb and 10.53 Mb for Basmati 334 and Dom Sufid, respectively. Using our highly contiguous assemblies, we characterize structural variations segregating across circum-basmati genomes. We discover repeat expansions not observed in japonica (the rice group most closely related to circum-basmati) as well as the presence and absence variants of over 20 Mb, one of which is a circum-basmati-specific deletion of a gene regulating awn length. We further detect strong evidence of admixture between the circum-basmati and circum-aus groups. This gene flow has its greatest effect on chromosome 10, causing both structural variation and single-nucleotide polymorphism to deviate from genome-wide history. Lastly, population genomic analysis of 78 circum-basmati varieties shows three major geographically structured genetic groups: Bhutan/Nepal, India/Bangladesh/Myanmar, and Iran/Pakistan. Conclusion: The availability of high-quality reference genomes allows functional and evolutionary genomic analyses providing genome-wide evidence for gene flow between circum-aus and circum-basmati, describes the nature of circum-basmati structural variation, and reveals the presence/absence variation in this important and iconic rice variety group.
first_indexed 2024-12-14T23:18:21Z
format Article
id doaj.art-684ef4ff48e74bbeb5abee60a299b41a
institution Directory Open Access Journal
issn 1474-760X
language English
last_indexed 2024-12-14T23:18:21Z
publishDate 2020-02-01
publisher BMC
record_format Article
series Genome Biology
spelling doaj.art-684ef4ff48e74bbeb5abee60a299b41a 2022-12-21T22:44:01Z eng BMC Genome Biology 1474-760X 2020-02-01 volume 21, issue 1, pages 1-27 10.1186/s13059-020-1938-2
Jae Young Choi (Center for Genomics and Systems Biology, Department of Biology, New York University)
Zoe N. Lye (Center for Genomics and Systems Biology, Department of Biology, New York University)
Simon C. Groen (Center for Genomics and Systems Biology, Department of Biology, New York University)
Xiaoguang Dai (Oxford Nanopore Technologies)
Priyesh Rughani (Oxford Nanopore Technologies)
Sophie Zaaijer (New York Genome Center)
Eoghan D. Harrington (Oxford Nanopore Technologies)
Sissel Juul (Oxford Nanopore Technologies)
Michael D. Purugganan (Center for Genomics and Systems Biology, Department of Biology, New York University)
title Nanopore sequencing-based genome assembly and evolutionary genomics of circum-basmati rice
topic Oryza sativa
Asian rice
Aromatic rice group
Domestication
Crop evolution
Nanopore sequencing
url https://doi.org/10.1186/s13059-020-1938-2