An advanced systems biology framework of feature engineering for cold tolerance genes discovery from integrated omics and non-omics data in soybean

Soybean is sensitive to low temperatures during the growing season, and breeding cold-tolerant cultivars is urgently needed to reduce the resulting production losses. Cold tolerance is a complex quantitative trait controlled by multiple genes, environmental...

Bibliographic Details
Main Authors: Pei-Hsiu Kao, Supaporn Baiya, Zheng-Yuan Lai, Chih-Min Huang, Li-Hsin Jhan, Chian-Jiun Lin, Ya-Syuan Lai, Chung-Feng Kao
Format: Article
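Language: English
Published: Frontiers Media S.A. 2022-09-01
Series: Frontiers in Plant Science
Subjects: soybean; cold tolerance; feature engineering; omics and non-omics data integration; systems biology; non-parameter random forest prioritization
Online Access: https://www.frontiersin.org/articles/10.3389/fpls.2022.1019709/full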
collection DOAJ
description Soybean is sensitive to low temperatures during the growing season, and breeding cold-tolerant cultivars is urgently needed to reduce the resulting production losses. Cold tolerance is a complex quantitative trait controlled by multiple genes, environmental factors, and their interactions. In this study, we propose an advanced systems biology framework of feature engineering for discovering cold tolerance genes (CTgenes) from integrated omics and non-omics (OnO) data in soybean. An integrative pipeline performs feature selection and feature extraction across the different layers of the integrated OnO data, using data ensemble methods and non-parameter random forest prioritization to minimize uncertainty and false positives and thereby improve accuracy. In total, 44, 143, and 45 CTgenes were identified from the corresponding gene pools for short-, mid-, and long-term cold treatment, respectively. In an independent RNA-seq database, these CTgenes outperformed the remaining genes, random genes, and candidate genes identified by other approaches. Pathway enrichment and crosstalk network analyses further uncovered relevant physiological pathways and revealed hormone- and defense-related modules underlying cold tolerance. The CTgenes were validated using genotype data for 55 SNPs in 56 soybean samples from cold tolerance experiments, which suggests that the CTgenes identified by the proposed framework can effectively distinguish cold-resistant from cold-sensitive lines. This is an important advance in understanding the soybean cold-stress response. The proposed pipelines offer a robust and efficient alternative for biomarker discovery, module discovery, and sample classification for a particular trait in plants.
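The description mentions pathway enrichment analysis without saying which statistical test was used. As a rough illustration only, the sketch below uses a one-sided hypergeometric test, a common choice for asking whether a selected gene set overlaps a pathway more than chance would predict. The function name, its parameters, and every number in the usage example are hypothetical and are not taken from the article.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, pathway: int, selected: int, overlap: int) -> float:
    """Return P(X >= overlap) for a hypergeometric draw.

    `selected` genes are drawn without replacement from `universe` genes,
    of which `pathway` genes belong to the pathway of interest.
    (Hypothetical helper; not the authors' implementation.)
    """
    if not (0 <= pathway <= universe and 0 <= selected <= universe):
        raise ValueError("pathway and selected must lie between 0 and universe")
    max_overlap = min(pathway, selected)
    if overlap > max_overlap:
        return 0.0
    # Smallest overlap that is possible at all, given the set sizes.
    min_overlap = max(0, selected - (universe - pathway))
    start = max(overlap, min_overlap)
    total = comb(universe, selected)
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(start, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Made-up numbers: 1000 background genes, a 40-gene pathway,
    # 44 selected genes, 6 of which fall in the pathway.
    p = hypergeom_enrichment_p(universe=1000, pathway=40, selected=44, overlap=6)
    print(f"enrichment p-value: {p:.3g}")
```

In practice such p-values would be computed for many pathways and then corrected for multiple testing; that step is omitted here.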
first_indexed 2024-12-10T05:40:33Z
id doaj.art-dddcd3908b074ea2833254ccd5dfff77
institution Directory Open Access Journal
issn 1664-462X
spelling doaj.art-dddcd3908b074ea2833254ccd5dfff77, 2022-12-22T02:00:19Z, eng
Frontiers Media S.A., Frontiers in Plant Science, ISSN 1664-462X, 2022-09-01, volume 13, doi 10.3389/fpls.2022.1019709, article 1019709
Author affiliations:
Pei-Hsiu Kao: Department of Agronomy, College of Agriculture and Natural Resources, National Chung Hsing University, Taichung, Taiwan
Supaporn Baiya: Department of Resource and Environment, Faculty of Science at Sriracha, Kasetsart University, Sriracha, Thailand
Zheng-Yuan Lai: Department of Agronomy, College of Agriculture and Natural Resources, National Chung Hsing University, Taichung, Taiwan
Chih-Min Huang: Department of Agronomy, College of Agriculture and Natural Resources, National Chung Hsing University, Taichung, Taiwan
Li-Hsin Jhan: Department of Agronomy, College of Agriculture and Natural Resources, National Chung Hsing University, Taichung, Taiwan
Chian-Jiun Lin: Department of Agronomy, College of Agriculture and Natural Resources, National Chung Hsing University, Taichung, Taiwan
Ya-Syuan Lai: Department of Agronomy, College of Agriculture and Natural Resources, National Chung Hsing University, Taichung, Taiwan
Chung-Feng Kao: Department of Agronomy, College of Agriculture and Natural Resources, National Chung Hsing University, Taichung, Taiwan; Advanced Plant Biotechnology Center, National Chung Hsing University, Taichung, Taiwan
Keywords: soybean; cold tolerance; feature engineering; omics and non-omics data integration; systems biology; non-parameter random forest prioritization
topic soybean
cold tolerance
feature engineering
omics and non-omics data integration
systems biology
non-parameter random forest prioritization