Application of machine learning and genetic optimization algorithms for modeling and optimizing soybean yield using its component traits.

Bibliographic Details
Main Authors: Mohsen Yoosefzadeh-Najafabadi, Dan Tulpan, Milad Eskandari
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2021-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0250665
Collection: DOAJ
Description: Improving genetic yield potential in major food-grade crops such as soybean (Glycine max L.) is the most sustainable way to address growing global food demand and the associated food security concerns. Yield is a complex trait that depends on several related variables called yield components. In this study, the five most important yield component traits in soybean were measured in a panel of 250 genotypes grown in four environments. These traits were the number of nodes per plant (NP), number of non-reproductive nodes per plant (NRNP), number of reproductive nodes per plant (RNP), number of pods per plant (PP), and the ratio of the number of pods to the number of nodes per plant (P/N). These data were used to predict total soybean seed yield with three machine learning (ML) algorithms, Multilayer Perceptron (MLP), Radial Basis Function (RBF), and Random Forest (RF), applied individually and collectively through an ensemble method based on a bagging strategy (E-B). The RBF algorithm, with the highest coefficient of determination (R²) of 0.81 and the lowest mean absolute error (MAE) and root mean square error (RMSE) of 148.61 kg ha⁻¹ and 185.31 kg ha⁻¹, respectively, was the most accurate individual algorithm and was therefore selected as the metaClassifier for the E-B algorithm. Using the E-B algorithm, we increased prediction accuracy, improving R², MAE, and RMSE by 0.1, 0.24 kg ha⁻¹, and 0.96 kg ha⁻¹, respectively. Furthermore, for the first time in this study, we combined the E-B with a genetic algorithm (GA) to model the optimum values of the yield components in an ideotype genotype in which yield is maximized. The results provide a better understanding of the relationships between soybean yield and its components, which can be used for selecting parental lines and designing promising crosses for developing cultivars with improved genetic yield potential.
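The three accuracy measures named in the description can be illustrated with a minimal sketch. It assumes the common definition of the coefficient of determination, 1 - SS_res / SS_tot; this record does not state which definition the authors used. The function name `regression_metrics` and the yield values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, MAE, RMSE) for paired observed and predicted values.

    Assumes R2 = 1 - SS_res / SS_tot, so the observed values must not
    all be identical (SS_tot would be zero).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Hypothetical seed yields in kg/ha, for illustration only.
observed = [3000.0, 3200.0, 3400.0, 3600.0]
predicted = [3100.0, 3100.0, 3500.0, 3500.0]
r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.2f}  MAE={mae:.2f} kg/ha  RMSE={rmse:.2f} kg/ha")
```

MAE and RMSE carry the units of the predicted trait (here kg/ha), whereas R² is unitless, which is why the description reports improvements in the three measures on different scales.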
ISSN: 1932-6203
Citation: PLoS ONE 16(4), e0250665