Yield prediction in a peanut breeding program using remote sensing data and machine learning algorithms
Peanut is a critical food crop worldwide, and the development of high-throughput phenotyping techniques is essential for enhancing the crop’s genetic gain rate. Given the obvious challenges of directly estimating peanut yields through remote sensing, an approach that utilizes above-ground phenotypes...
Main Authors: | N. Ace Pugh, Andrew Young, Manisha Ojha, Yves Emendack, Jacobo Sanchez, Zhanguo Xin, Naveen Puppala |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2024-02-01 |
Series: | Frontiers in Plant Science |
Subjects: | artificial intelligence; crop yield; growth curves; machine learning; peanut; plant breeding |
Online Access: | https://www.frontiersin.org/articles/10.3389/fpls.2024.1339864/full |
_version_ | 1827346845969416192 |
---|---|
author | N. Ace Pugh Andrew Young Manisha Ojha Yves Emendack Jacobo Sanchez Zhanguo Xin Naveen Puppala |
collection | DOAJ |
description | Peanut is a critical food crop worldwide, and the development of high-throughput phenotyping techniques is essential for enhancing the crop’s genetic gain rate. Given the obvious challenges of directly estimating peanut yields through remote sensing, an approach that utilizes above-ground phenotypes to estimate underground yield is necessary. To that end, this study leveraged unmanned aerial vehicles (UAVs) for high-throughput phenotyping of surface traits in peanut. Using a diverse set of peanut germplasm planted in 2021 and 2022, UAV flight missions were repeatedly conducted to capture image data that were used to construct high-resolution multitemporal sigmoidal growth curves based on apparent characteristics, such as canopy cover and canopy height. Latent phenotypes extracted from these growth curves and their first derivatives informed the development of advanced machine learning models, specifically random forest and eXtreme Gradient Boosting (XGBoost), to estimate yield in the peanut plots. The random forest model exhibited exceptional predictive accuracy (R² = 0.93), while XGBoost was also reasonably effective (R² = 0.88). When using confusion matrices to evaluate the classification abilities of each model, the two models proved valuable in a breeding pipeline, particularly for filtering out underperforming genotypes. In addition, the random forest model excelled in identifying top-performing material while minimizing Type I and Type II errors. Overall, these findings underscore the potential of machine learning models, especially random forests and XGBoost, in predicting peanut yield and improving the efficiency of peanut breeding programs. |
first_indexed | 2024-03-07T23:38:36Z |
format | Article |
id | doaj.art-b496b5e132574d47851fb14d52014ab2 |
institution | Directory of Open Access Journals |
issn | 1664-462X |
language | English |
last_indexed | 2024-03-07T23:38:36Z |
publishDate | 2024-02-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Plant Science |
spelling | doaj.art-b496b5e132574d47851fb14d52014ab2; 2024-02-20T04:29:20Z; eng; Frontiers Media S.A.; Frontiers in Plant Science; ISSN 1664-462X; 2024-02-01; vol. 15; DOI 10.3389/fpls.2024.1339864; article 1339864; Yield prediction in a peanut breeding program using remote sensing data and machine learning algorithms; N. Ace Pugh, Andrew Young, Manisha Ojha, Yves Emendack, Jacobo Sanchez, Zhanguo Xin, Naveen Puppala; Affiliations: United States Department of Agriculture, Crop Stress Research Laboratory, Lubbock, TX, United States (Pugh, Young, Emendack, Sanchez, Xin); Agricultural Science Center at Clovis, New Mexico State University, Clovis, NM, United States (Ojha, Puppala); https://www.frontiersin.org/articles/10.3389/fpls.2024.1339864/full; artificial intelligence, crop yield, growth curves, machine learning, peanut, plant breeding |
title | Yield prediction in a peanut breeding program using remote sensing data and machine learning algorithms |
topic | artificial intelligence crop yield growth curves machine learning peanut plant breeding |
url | https://www.frontiersin.org/articles/10.3389/fpls.2024.1339864/full |
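
The description above mentions sigmoidal growth curves, their first derivatives, and confusion matrices for judging selection decisions. The following is a minimal illustrative sketch of those ideas, not the authors' implementation. It assumes a three-parameter logistic curve (the record does not state which sigmoidal form was used), and the function names, the parameters `k`, `r`, `t0`, and the `top_fraction` threshold are all hypothetical choices made for this example.

```python
import math


def logistic(t, k, r, t0):
    """Assumed sigmoidal growth curve: k / (1 + exp(-r * (t - t0)))."""
    return k / (1.0 + math.exp(-r * (t - t0)))


def logistic_rate(t, k, r, t0):
    """First derivative of the assumed logistic curve with respect to t."""
    f = logistic(t, k, r, t0)
    return r * f * (1.0 - f / k)


def curve_features(k, r, t0):
    """Hypothetical latent phenotypes derived from fitted curve parameters."""
    return {
        "asymptote": k,            # final canopy cover or height
        "inflection_day": t0,      # day of fastest growth
        "max_rate": logistic_rate(t0, k, r, t0),  # equals r * k / 4
    }


def selection_confusion(observed, predicted, top_fraction=0.25):
    """Mark the top fraction of plots as 'selected' in each list and count agreement.

    Returns true/false positives and negatives, where a false positive
    (Type I error) is a plot selected by prediction but not by observed yield,
    and a false negative (Type II error) is the reverse.
    """
    n = len(observed)
    n_top = max(1, round(n * top_fraction))

    def top_set(values):
        order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
        return set(order[:n_top])

    obs_top = top_set(observed)
    pred_top = top_set(predicted)
    tp = len(obs_top & pred_top)
    fp = len(pred_top - obs_top)   # Type I errors
    fn = len(obs_top - pred_top)   # Type II errors
    tn = n - tp - fp - fn
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}
```

In a workflow like the one described, features such as these would be computed per plot and passed to a regression model (random forest or XGBoost in the study); the confusion counts then summarize how often the model's ranking and the observed ranking agree on which plots to advance or discard.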