Machine learning for yield prediction in Fergana valley, Central Asia

Accurate yield prediction is essential for growers, researchers, governments, the farming industry, and policymakers for social peace, food safety, security, and sustainable development. The results of earlier techniques of data collecting and analysis for yield forecasts were typically delayed, exp...

Bibliographic Details
Main Authors: Mukesh Singh Boori, Komal Choudhary, Rustam Paringer, Alexander Kupriyanov
Format: Article
Language: English
Published: Elsevier 2023-02-01
Series: Journal of the Saudi Society of Agricultural Sciences
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S1658077X22000777
collection DOAJ
description Accurate yield prediction is essential for growers, researchers, governments, the farming industry, and policymakers, because it supports social peace, food safety, security, and sustainable development. Earlier techniques of data collection and analysis for yield forecasting typically produced results that were delayed, expensive, time-consuming, site-specific, and riddled with errors and uncertainties. This study presents a novel approach that uses high-resolution satellite data in conjunction with environmental and topographic data to predict wheat yield variability at the farm scale using machine learning. Winter wheat yield prediction was based on 36 indicators, using correlation analysis and three regression models implemented with scikit-learn: linear regression (LR), decision tree (DT), and random forest (RF). The models were trained and validated on more than 10,000 data points from 45 farms in the Fergana valley, Central Asia. Results indicate that Sentinel-2 data at 10 m resolution, combined with secondary data such as topographic, soil, environmental, and field data, can generate an accurate wheat yield prediction map. Accuracy is lowest for LR (R2: 95, RMSE: 2.31), intermediate for DT (R2: 97, RMSE: 1.85), and highest for RF (R2: 98, RMSE: 1.40). Results also indicate that prediction in the early stage of the crop is less accurate than at harvesting time: LR (R2: 85, RMSE: 2.66), DT (R2: 95, RMSE: 2.06), RF (R2: 97, RMSE: 1.54). Applying the RF model, the predicted winter wheat yield is 3.29 to 4.30 t/ha, and the total wheat production in the study area is therefore approximately 100 t. Thus, this study demonstrates the capability of high-resolution satellite imagery and secondary data for highly accurate real-time crop yield prediction at the field scale, which can be used to assist precision agriculture and provides a point of reference for crop area extraction, mapping, monitoring, and sustainable development with food security.
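The model comparison described in the abstract (LR, DT, and RF regressors in scikit-learn, scored by R2 and RMSE) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic (generated with `make_regression` as a stand-in for the 36 indicators), and the 70/30 split, the hyperparameters, and the helper name `compare_regressors` are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def compare_regressors(X, y, seed=0):
    """Fit LR, DT, and RF on a train split and score each on a held-out split.

    Hypothetical helper: the split ratio and hyperparameters are assumptions,
    not values reported in the record above.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    models = {
        "LR": LinearRegression(),
        "DT": DecisionTreeRegressor(random_state=seed),
        "RF": RandomForestRegressor(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        # RMSE is the square root of the mean squared error.
        rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
        scores[name] = {"R2": float(r2_score(y_test, pred)), "RMSE": rmse}
    return scores


# Synthetic stand-in for the 36 indicators; NOT the study's data.
X, y = make_regression(
    n_samples=1000, n_features=36, n_informative=10, noise=5.0, random_state=0
)
scores = compare_regressors(X, y)
for name, s in scores.items():
    print(f"{name}: R2={s['R2']:.3f}, RMSE={s['RMSE']:.2f}")
```

Because the synthetic target here is linear in the features, the ranking of the three models will not match the one reported in the abstract; the sketch only shows how the three regressors and the two metrics fit together.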
first_indexed 2024-04-10T06:21:41Z
id doaj.art-6ecf94aa77214498a4697e61832c80b4
institution Directory Open Access Journal
issn 1658-077X
last_indexed 2024-04-10T06:21:41Z
spelling doaj.art-6ecf94aa77214498a4697e61832c80b4, 2023-03-02T04:59:02Z, eng, Elsevier, Journal of the Saudi Society of Agricultural Sciences, 1658-077X, 2023-02-01, volume 22, issue 2, pages 107-120
affiliations Mukesh Singh Boori (corresponding author): Scientific Research Laboratory of Automated Systems of Scientific Research (SRL-35), Samara National Research University, Samara, Russia
Komal Choudhary (corresponding author): Scientific Research Laboratory of Automated Systems of Scientific Research (SRL-35), Samara National Research University, Samara, Russia; Department of Land Surveying and Geo-Informatics, Smart Cities Research Institute, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China
Rustam Paringer: Scientific Research Laboratory of Automated Systems of Scientific Research (SRL-35), Samara National Research University, Samara, Russia; Image Processing Systems Institute of the RAS–Branch of the FSRC “Crystallography and Photonics”, Samara, Russia
Alexander Kupriyanov: Scientific Research Laboratory of Automated Systems of Scientific Research (SRL-35), Samara National Research University, Samara, Russia; Image Processing Systems Institute of the RAS–Branch of the FSRC “Crystallography and Photonics”, Samara, Russia
topic Yield prediction
Regression
Sentinel-2
Spectral-indices
Machine learning
Phenology