GxENet: Novel fully connected neural network based approaches to incorporate GxE for predicting wheat yield

The expression of quantitative traits of a line of a crop depends on its genetics, the environment where it is sown and the interaction between the genetic information and the environment known as GxE. Thus to maximize food production, new varieties are developed by selecting superior lines of seeds...

Bibliographic Details
Main Authors: Sheikh Jubair, Olivier Tremblay-Savard, Mike Domaratzki
Format: Article
Language: English
Published: KeAi Communications Co., Ltd. 2023-06-01
Series: Artificial Intelligence in Agriculture
Subjects: Genomic prediction; Multi-environment trial; Deep learning; GxE; Enviromics
Online Access: http://www.sciencedirect.com/science/article/pii/S2589721723000168
_version_ 1797791905828831232
author Sheikh Jubair
Olivier Tremblay-Savard
Mike Domaratzki
collection DOAJ
description The expression of quantitative traits of a line of a crop depends on its genetics, the environment where it is sown and the interaction between the genetic information and the environment known as GxE. Thus to maximize food production, new varieties are developed by selecting superior lines of seeds suitable for a specific environment. Genomic selection is a computational technique for developing a new variety that uses whole genome molecular markers to identify top lines of a crop. A large number of statistical and machine learning models are employed for single environment trials, where it is assumed that the environment does not have any effect on the quantitative traits. However, it is essential to consider both genomic and environmental data to develop a new variety, as these strong assumptions may lead to failing to select top lines for an environment. Here we devised three novel deep learning frameworks incorporating GxE within the deep learning model and predicted line-specific yield for an environment. In the process, we also developed a new technique for identifying environment-specific markers that can be useful in many applications of environment-specific genomic selection. The result demonstrates that our best framework obtains 1.75 to 1.95 times better correlation coefficients than other deep learning models that incorporate environmental data depending on the test scenario. Furthermore, the feature importance analysis shows that environmental information, followed by genomic information, is the driving factor in predicting environment-specific yield for a line. We also demonstrate a way to extend our framework for new data types, such as text or soil data. The extended model also shows the potential to be useful in genomic selection.
first_indexed 2024-03-13T02:25:28Z
format Article
id doaj.art-c5ba390e14ef4fa3b0f7d8870ffae016
institution Directory Open Access Journal
issn 2589-7217
language English
last_indexed 2024-03-13T02:25:28Z
publishDate 2023-06-01
publisher KeAi Communications Co., Ltd.
record_format Article
series Artificial Intelligence in Agriculture
spelling doaj.art-c5ba390e14ef4fa3b0f7d8870ffae016
2023-06-30T04:22:40Z
eng
KeAi Communications Co., Ltd.
Artificial Intelligence in Agriculture
2589-7217
2023-06-01
86076
GxENet: Novel fully connected neural network based approaches to incorporate GxE for predicting wheat yield
Sheikh Jubair (0)
Olivier Tremblay-Savard (1)
Mike Domaratzki (2)
(0) Department of Computer Science, University of Manitoba, 66 Chancellors Cir, Winnipeg, MB, R3T 2N2, Canada; Corresponding author.
(1) Department of Computer Science, University of Manitoba, 66 Chancellors Cir, Winnipeg, MB, R3T 2N2, Canada
(2) Department of Computer Science, University of Western Ontario, 1151 Richmond St, London, ON, N6A 3K7, Canada
The expression of quantitative traits of a line of a crop depends on its genetics, the environment where it is sown and the interaction between the genetic information and the environment known as GxE. Thus to maximize food production, new varieties are developed by selecting superior lines of seeds suitable for a specific environment. Genomic selection is a computational technique for developing a new variety that uses whole genome molecular markers to identify top lines of a crop. A large number of statistical and machine learning models are employed for single environment trials, where it is assumed that the environment does not have any effect on the quantitative traits. However, it is essential to consider both genomic and environmental data to develop a new variety, as these strong assumptions may lead to failing to select top lines for an environment. Here we devised three novel deep learning frameworks incorporating GxE within the deep learning model and predicted line-specific yield for an environment. In the process, we also developed a new technique for identifying environment-specific markers that can be useful in many applications of environment-specific genomic selection. The result demonstrates that our best framework obtains 1.75 to 1.95 times better correlation coefficients than other deep learning models that incorporate environmental data depending on the test scenario. Furthermore, the feature importance analysis shows that environmental information, followed by genomic information, is the driving factor in predicting environment-specific yield for a line. We also demonstrate a way to extend our framework for new data types, such as text or soil data. The extended model also shows the potential to be useful in genomic selection.
http://www.sciencedirect.com/science/article/pii/S2589721723000168
Genomic prediction
Multi-environment trial
Deep learning
GxE
Enviromics
title GxENet: Novel fully connected neural network based approaches to incorporate GxE for predicting wheat yield
topic Genomic prediction
Multi-environment trial
Deep learning
GxE
Enviromics
url http://www.sciencedirect.com/science/article/pii/S2589721723000168