Use of multivariate analysis and machine learning methods to characterize traits contributing to wheat yield diversity
Aim of study: Wheat is the third largest staple food crop in the world, so determining the factors affecting its yield is of great importance. This study aimed to identify useful subsets of agronomic traits and to rank the traits by their importance to grain yield. Area of study: Fars provi...
Main Authors: | Ali BEHPOURI, Sara FAROKHZADEH, Zahra ZINATI, Zobeir KHOSRAVI |
---|---|
Format: | Article |
Language: | English |
Published: | Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria, 2023-02-01 |
Series: | Spanish Journal of Agricultural Research |
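Subjects: | Triticum aestivum, multivariate statistical analysis, partial least squares regression, support vector regression |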
Online Access: | https://revistas.inia.es/index.php/sjar/article/view/19835 |
_version_ | 1811158130796527616 |
---|---|
author | Ali BEHPOURI; Sara FAROKHZADEH; Zahra ZINATI; Zobeir KHOSRAVI |
author_facet | Ali BEHPOURI; Sara FAROKHZADEH; Zahra ZINATI; Zobeir KHOSRAVI |
author_sort | Ali BEHPOURI |
collection | DOAJ |
description |
Aim of study: Wheat is the third largest staple food crop in the world, so determining the factors affecting its yield is of great importance. This study aimed to identify useful subsets of agronomic traits and to rank the traits by their importance to grain yield.
Area of study: Fars province, Iran.
Material and methods: Data on 22 agronomic traits were collected from 90 farms in six regions of Fars province, Iran (Darab, Kavar, Marvdasht, Fasa, Lar, and Khonj), which are the most important wheat-growing regions. Multivariate statistical analyses (correlation, stepwise regression, and principal component analysis (PCA)) and machine learning models (partial least squares regression (PLSR) and support vector regression (SVR)) were applied to the agronomic traits.
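As a rough illustration of the correlation step named in the methods, the sketch below ranks traits by the absolute Pearson correlation with grain yield. The trait names echo the abstract, but every number is invented for illustration and the helper names `pearson` and `rank_traits` are hypothetical; this is not the authors' code or data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rank_traits(traits, grain_yield):
    """Return (trait, r) pairs sorted by |r| with yield, strongest first."""
    scores = [(name, pearson(values, grain_yield)) for name, values in traits.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Invented values for five hypothetical farms (NOT data from the study).
grain_yield = [3.1, 4.0, 4.8, 5.5, 6.2]
traits = {
    "spikes_per_m2": [310, 355, 420, 450, 505],
    "thousand_grain_weight": [36.0, 41.0, 38.0, 43.0, 42.0],
    "soil_salinity": [6.5, 5.0, 5.2, 3.1, 2.8],
}

ranking = rank_traits(traits, grain_yield)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

Ranking by absolute value keeps strongly negative relationships (such as the invented salinity column here) near the top. A pairwise screen like this ignores collinearity between traits, which is one reason a study might pair it with stepwise regression and PCA.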
Main results: The combined results of correlation, stepwise regression, and PCA indicated that number of spikes m⁻², grain number spike⁻¹, and thousand-grain weight had the greatest impact on yield, followed by awn length, spike length, narrow-leaf herbicide, broadleaf herbicide, time to plant maturity (month), and soil salinity. In addition, PLSR with these nine selected traits as inputs showed better prediction capability (R² = 85%, RMSE = 0.32, MSE = 0.10, and BIAS = -0.05) than PLSR with all 22 input traits.
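The reported fit statistics can be related to one another with a short sketch. The abstract does not define its metrics, so the definitions below are common conventions assumed here: R² as one minus the ratio of residual to total sum of squares, and BIAS as the mean of predicted minus observed values. The function name `regression_metrics` and the sample values are hypothetical. One relation holds by definition, RMSE = √MSE, and the reported figures are consistent with it (0.32² ≈ 0.10).

```python
import math


def regression_metrics(observed, predicted):
    """Compute MSE, RMSE, BIAS and R^2 under the conventions assumed above."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    # Residuals are taken as predicted minus observed (assumed sign convention).
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in residuals)
    mse = ss_res / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "bias": sum(residuals) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Invented observed and predicted yields (NOT data from the study).
observed = [3.1, 4.0, 4.8, 5.5, 6.2]
predicted = [3.4, 3.8, 4.9, 5.2, 6.1]
metrics = regression_metrics(observed, predicted)
print(metrics)

# Consistency check on the figures reported in the abstract, to two decimals.
reported_rmse, reported_mse = 0.32, 0.10
consistent = math.isclose(reported_rmse ** 2, reported_mse, abs_tol=0.005)
print("reported RMSE^2 matches reported MSE:", consistent)
```

A negative BIAS under this convention would mean the model underpredicts yield on average. Whether the authors used the same sign convention is not stated in the abstract.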
Research highlights: Integrating multivariate statistical analyses with machine learning regression methods could be a powerful tool for identifying traits that have a significant impact on yield. These findings can inform future breeding programs.
|
first_indexed | 2024-04-10T05:18:09Z |
format | Article |
id | doaj.art-6d1912dc9eaa4e3c82a34db1a66688ba |
institution | Directory Open Access Journal |
issn | 2171-9292 |
language | English |
last_indexed | 2024-04-10T05:18:09Z |
publishDate | 2023-02-01 |
publisher | Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria |
record_format | Article |
series | Spanish Journal of Agricultural Research |
spelling | doaj.art-6d1912dc9eaa4e3c82a34db1a66688ba; 2023-03-08T13:03:03Z; eng; Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria; Spanish Journal of Agricultural Research; 2171-9292; 2023-02-01; 21; 1; 10.5424/sjar/2023211-19835; Use of multivariate analysis and machine learning methods to characterize traits contributing to wheat yield diversity; Ali BEHPOURI (0), Sara FAROKHZADEH (1), Zahra ZINATI (2), Zobeir KHOSRAVI (3); Department of Agroecology, College of Agriculture and Natural Resources of Darab, Shiraz University, Iran (0, 1, 2, 3); https://revistas.inia.es/index.php/sjar/article/view/19835; Triticum aestivum; multivariate statistical analysis; partial least squares regression; support vector regression |
spellingShingle | Ali BEHPOURI Sara FAROKHZADEH Zahra ZINATI Zobeir KHOSRAVI Use of multivariate analysis and machine learning methods to characterize traits contributing to wheat yield diversity Spanish Journal of Agricultural Research Triticum aestivum multivariate statistical analysis partial least squares regression support vector regression |
title | Use of multivariate analysis and machine learning methods to characterize traits contributing to wheat yield diversity |
title_full | Use of multivariate analysis and machine learning methods to characterize traits contributing to wheat yield diversity |
title_fullStr | Use of multivariate analysis and machine learning methods to characterize traits contributing to wheat yield diversity |
title_full_unstemmed | Use of multivariate analysis and machine learning methods to characterize traits contributing to wheat yield diversity |
title_short | Use of multivariate analysis and machine learning methods to characterize traits contributing to wheat yield diversity |
title_sort | use of multivariate analysis and machine learning methods to characterize traits contributing to wheat yield diversity |
topic | Triticum aestivum; multivariate statistical analysis; partial least squares regression; support vector regression |
url | https://revistas.inia.es/index.php/sjar/article/view/19835 |
work_keys_str_mv | AT alibehpouri useofmultivariateanalysisandmachinelearningmethodstocharacterizetraitscontributingtowheatyielddiversity AT sarafarokhzadeh useofmultivariateanalysisandmachinelearningmethodstocharacterizetraitscontributingtowheatyielddiversity AT zahrazinati useofmultivariateanalysisandmachinelearningmethodstocharacterizetraitscontributingtowheatyielddiversity AT zobeirkhosravi useofmultivariateanalysisandmachinelearningmethodstocharacterizetraitscontributingtowheatyielddiversity |