A Hybrid Approach to Tea Crop Yield Prediction Using Simulation Models and Machine Learning

Bibliographic Details
Main Authors: Dania Batool, Muhammad Shahbaz, Hafiz Shahzad Asif, Kamran Shaukat, Talha Mahboob Alam, Ibrahim A. Hameed, Zeeshan Ramzan, Abdul Waheed, Hanan Aljuaid, Suhuai Luo
Format: Article
Language: English
Published: MDPI AG 2022-07-01
Series: Plants
Online Access: https://www.mdpi.com/2223-7747/11/15/1925
Description
Summary: Tea (Camellia sinensis L.) is one of the most widely consumed beverages globally after water. Several countries import large quantities of tea to meet domestic demand, so accurate and timely prediction of tea yield is critical. Previous studies used statistical, deep learning, and machine learning techniques for tea yield prediction, but crop simulation models have not yet been used. A simulation model therefore needs to be calibrated for tea yield prediction and compared with these approaches across different data types. This study compares methods for tea yield prediction using the AquaCrop simulation model of the Food and Agriculture Organization (FAO) of the United Nations and machine learning techniques. We used weather, soil, crop, and agro-management data from 2016 to 2019, acquired from tea fields of the National Tea and High-Value Crop Research Institute (NTHRI), Pakistan, to calibrate the AquaCrop simulation model and to train regression algorithms. Calibration of the AquaCrop model achieved a mean absolute error (MAE) of 0.45 t/ha, a mean squared error (MSE) of 0.23 (t/ha)², and a root mean square error (RMSE) of 0.48 t/ha. Out of the ten regression models, we achieved the lowest MAE of 0.093 t/ha, MSE of 0.015 (t/ha)², and RMSE of 0.120 t/ha using 10-fold cross-validation, and an MAE of 0.123 t/ha, MSE of 0.024 (t/ha)², and RMSE of 0.154 t/ha using the XGBoost regressor with a train-test split. We concluded that the machine learning regression algorithm predicted yield better than the simulation model while using less data. This study provides a technique to improve tea yield prediction by combining different data sources using a crop simulation model and machine learning algorithms.
ISSN: 2223-7747
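
The three error measures quoted in the Summary are related in a simple way: RMSE is the square root of MSE (for example, 0.48 squared is about 0.23, matching the calibration figures). The following is a minimal sketch of how such metrics and a k-fold split could be computed. It is not the authors' code: the function names, the contiguous fold layout, and the yield values are illustrative assumptions, and the numbers are not data from the study.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the yields (e.g. t/ha)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error, in squared yield units."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Hypothetical helper: each sample lands in exactly one test fold,
    which is the idea behind 10-fold cross-validation (k=10).
    """
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples)
                     if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


# Made-up observed and predicted yields in t/ha (illustration only).
observed = [2.0, 2.5, 3.0, 3.5]
predicted = [2.5, 2.0, 3.5, 3.0]

print("MAE :", mae(observed, predicted))
print("MSE :", mse(observed, predicted))
print("RMSE:", rmse(observed, predicted))
```

With a train-test split, the metrics are computed once on a single held-out subset. With k-fold cross-validation, they are computed on each test fold in turn and then averaged, so every sample is used for testing exactly once.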