Comparison Between Machine Learning Models for Yield Forecast in Cocoa Crops in Santander, Colombia

Bibliographic Details
Main Authors: Henry Lamos-Díaz, Ph. D., David Esteban Puentes-Garzón, M.Sc., Diego Alejandro Zarate-Caicedo, Ph. D.
Format: Article
Language: English
Published: Universidad Pedagógica y Tecnológica de Colombia 2020-05-01
Series: Revista Facultad de Ingeniería
Online Access: https://revistas.uptc.edu.co/index.php/ingenieria/article/view/10853
collection DOAJ
description Identifying the factors that influence crop yield (kg·ha⁻¹) provides essential information for decisions on predicting and improving productivity, which in turn gives farmers the opportunity to increase their income. This study investigates the application of several machine learning algorithms to cocoa yield prediction and to the identification of influencing factors. Support Vector Machines (SVM) and ensemble learning models (Random Forests, Gradient Boosting) are compared with Least Absolute Shrinkage and Selection Operator (LASSO) regression models. The predictors considered were climate conditions, cocoa variety, fertilization level, and sun exposure in an experimental crop located in Rionegro, Santander. The results showed that Gradient Boosting is the best prediction alternative, with a coefficient of determination (R²) = 68%, Mean Absolute Error (MAE) = 13.32, and Root Mean Square Error (RMSE) = 20.41. Crop yield variability is explained mainly by the radiation one month before harvest, the accumulated rainfall in the harvest month, and the temperature one month before harvest. Crop yields were also evaluated by type of sun exposure: radiation one month before harvest was found to be the most influential factor in shade-grown plants, whereas rainfall and soil moisture are the determining variables in sun-grown plants, which is associated with their water requirements. These results suggest differentiated crop management depending on the type of sun exposure to avoid compromising productivity, since there is no significant difference in yield between the two management systems.
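The three evaluation metrics named in the abstract (R², MAE, RMSE) can be illustrated with a minimal sketch. This is not the authors' code: the function name `regression_metrics` and the sample yield values are hypothetical, invented only to show how the metrics are usually computed from paired observed and predicted values.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R^2, MAE, RMSE) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Hypothetical yields (kg per hectare), invented for illustration only;
# they are not data from the study.
observed = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 115.0, 150.0, 145.0]

r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.3f}  MAE={mae:.2f}  RMSE={rmse:.2f}")
```

MAE and RMSE are expressed in the units of the target variable, while R² is unitless, so the abstract's R² = 68% corresponds to 0.68 on this scale.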
first_indexed 2024-12-21T21:22:46Z
id doaj.art-af4a7884f3454464a738d2e260e6f4fa
institution Directory Open Access Journal
issn 0121-1129
2357-5328
last_indexed 2024-12-21T21:22:46Z
doi 10.19053/01211129.v29.n54.2020.10853
volume 29
issue 54
article_number e10853
affiliation Henry Lamos-Díaz, Ph. D.: Universidad Industrial de Santander
David Esteban Puentes-Garzón, M.Sc.: Universidad Industrial de Santander
Diego Alejandro Zarate-Caicedo, Ph. D.: Corporación Colombiana de Investigación Agropecuaria-AGROSAVIA
title Comparison Between Machine Learning Models for Yield Forecast in Cocoa Crops in Santander, Colombia
topic agricultural-yield
agroforestry-system
cocoa
machine-learning
prediction
productivity