Crop phenotype prediction using biclustering to explain genotype-by-environment interactions
Phenotypic variation in plants is attributed to genotype (G), environment (E), and genotype-by-environment interaction (GEI). Although the main effects of G and E are typically larger and easier to model, GEI effects are important and a critical factor when considering such issues as...
Main Authors: | Hieu Pham, John Reisner, Ashley Swift, Sigurdur Olafsson, Stephen Vardeman |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A. 2022-09-01 |
Series: | Frontiers in Plant Science |
Subjects: | linear model, no-interaction model, missing data, unsupervised learning, machine learning |
Online Access: | https://www.frontiersin.org/articles/10.3389/fpls.2022.975976/full |
_version_ | 1797995339912839168 |
---|---|
author | Hieu Pham, John Reisner, Ashley Swift, Sigurdur Olafsson, Stephen Vardeman |
author_facet | Hieu Pham, John Reisner, Ashley Swift, Sigurdur Olafsson, Stephen Vardeman |
author_sort | Hieu Pham |
collection | DOAJ |
description | Phenotypic variation in plants is attributed to genotype (G), environment (E), and genotype-by-environment interaction (GEI). Although the main effects of G and E are typically larger and easier to model, GEI effects are important and are a critical factor when considering questions such as why some genotypes perform consistently well across a range of environments. In plant breeding, a major challenge is limited information: a single genotype is tested in only a small subset of all possible test environments. The two-way table of phenotype responses will therefore commonly contain missing data. In this paper, we propose a new model of GEI effects that requires as input only a two-way table of phenotype observations, with genotypes as rows and environments as columns, and that does not assume the data are complete. Our analysis can deal with this scenario because it uses a novel biclustering algorithm that can handle missing values and that outputs homogeneous cells with no interaction between G and E. In other words, we identify subsets of genotypes and environments within which the phenotype can be modeled simply. Based on this, we fit no-interaction models to predict phenotypes of a given crop and draw insights into how a particular cultivar will perform in the unused test environments. Our new methodology is validated on data from different plant species and phenotypes and shows superior performance compared to well-studied statistical approaches. |
first_indexed | 2024-04-11T10:00:04Z |
format | Article |
id | doaj.art-e75636d51412482f93f4babb163310a1 |
institution | Directory of Open Access Journals |
issn | 1664-462X |
language | English |
last_indexed | 2024-04-11T10:00:04Z |
publishDate | 2022-09-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Plant Science |
spelling | doaj.art-e75636d51412482f93f4babb163310a1; 2022-12-22T04:30:27Z; eng; Frontiers Media S.A.; Frontiers in Plant Science; 1664-462X; 2022-09-01; 13; 10.3389/fpls.2022.975976; 975976; Crop phenotype prediction using biclustering to explain genotype-by-environment interactions; Hieu Pham (0), John Reisner (1), Ashley Swift (2), Sigurdur Olafsson (3), Stephen Vardeman (4), Stephen Vardeman (5); affiliations in the order listed: Department of Information Systems, Supply Chain, and Analytics, College of Business, The University of Alabama in Huntsville, Huntsville, AL, United States / Department of Statistics, Iowa State University, Ames, IA, United States / Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States / Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States / Department of Statistics, Iowa State University, Ames, IA, United States / Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States; https://www.frontiersin.org/articles/10.3389/fpls.2022.975976/full; linear model, no-interaction model, missing data, unsupervised learning, machine learning |
title | Crop phenotype prediction using biclustering to explain genotype-by-environment interactions |
title_sort | crop phenotype prediction using biclustering to explain genotype by environment interactions |
topic | linear model, no-interaction model, missing data, unsupervised learning, machine learning |
url | https://www.frontiersin.org/articles/10.3389/fpls.2022.975976/full |
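
The description above mentions fitting no-interaction models to a two-way table of phenotype observations that contains missing cells. The following is a minimal sketch of that one step only, assuming an additive model y_ij = mu + g_i + e_j fitted by ordinary least squares over the observed cells. It is not the authors' biclustering algorithm, and the function name `fit_no_interaction` and the toy numbers are hypothetical, chosen for illustration.

```python
import numpy as np


def fit_no_interaction(table):
    """Fit y_ij = mu + g_i + e_j by least squares on the observed cells.

    `table` is a genotypes-by-environments array with NaN for missing cells.
    Returns the fitted value for every cell, including the missing ones.
    (Illustrative helper, not taken from the paper.)
    """
    y = np.asarray(table, dtype=float)
    n_g, n_e = y.shape
    rows, cols = np.nonzero(~np.isnan(y))  # positions of observed cells
    n_obs = rows.size

    # Design matrix: one intercept column, then one indicator column per
    # genotype, then one indicator column per environment.
    X = np.zeros((n_obs, 1 + n_g + n_e))
    X[:, 0] = 1.0
    X[np.arange(n_obs), 1 + rows] = 1.0
    X[np.arange(n_obs), 1 + n_g + cols] = 1.0

    # The design is rank deficient; lstsq returns the minimum-norm solution,
    # which is enough because only the sums mu + g_i + e_j are used below.
    coef, *_ = np.linalg.lstsq(X, y[rows, cols], rcond=None)
    mu, g, e = coef[0], coef[1:1 + n_g], coef[1 + n_g:]
    return mu + g[:, None] + e[None, :]


# Toy table: 3 genotypes x 3 environments, exactly additive, one cell unobserved.
full = np.add.outer([0.0, 1.0, 3.0], [10.0, 20.0, 40.0])
observed = full.copy()
observed[2, 1] = np.nan
predicted = fit_no_interaction(observed)
print(predicted[2, 1])  # prediction for the unobserved genotype/environment pair
```

A prediction for a missing cell is only well determined when its genotype and environment are linked to each other through a chain of observed cells; in the setting described above, such a fit would be applied within a subset of genotypes and environments where the no-interaction assumption is plausible.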