Integration of eQTL and machine learning to dissect causal genes with pleiotropic effects in genetic regulation networks of seed cotton yield

Bibliographic Details
Main Authors: Ting Zhao, Hongyu Wu, Xutong Wang, Yongyan Zhao, Luyao Wang, Jiaying Pan, Huan Mei, Jin Han, Siyuan Wang, Kening Lu, Menglin Li, Mengtao Gao, Zeyi Cao, Hailin Zhang, Ke Wan, Jie Li, Lei Fang, Tianzhen Zhang, Xueying Guan
Format: Article
Language: English
Published: Elsevier 2023-09-01
Series: Cell Reports
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2211124723011221
Description
Summary: The dissection of a gene regulatory network (GRN) that complements the genome-wide association study (GWAS) locus and the crosstalk underlying multiple agronomical traits remains a major challenge. In this study, we generate 558 transcriptional profiles of lint-bearing ovules at one day post-anthesis from a selective core cotton germplasm, from which 12,207 expression quantitative trait loci (eQTLs) are identified. Sixty-six known phenotypic GWAS loci are colocalized with 1,090 eQTLs, forming 38 functional GRNs associated predominantly with seed yield. Of the eGenes, 34 exhibit pleiotropic effects. Combining the eQTLs within the seed yield GRNs significantly increases the portion of narrow-sense heritability. The extreme gradient boosting (XGBoost) machine learning approach is applied to predict seed cotton yield phenotypes on the basis of gene expression. Top-ranking eGenes (NF-YB3, FLA2, and GRDP1) derived with pleiotropic effects on yield traits are validated, along with their potential roles by correlation analysis, domestication selection analysis, and transgenic plants.
ISSN:2211-1247