Splitting and Length of Years for Improving Tree-Based Models to Predict Reference Crop Evapotranspiration in the Humid Regions of China

To improve the accuracy of estimating reference crop evapotranspiration for the efficient management of water resources and the optimal design of irrigation scheduling, the drawback of the traditional FAO-56 Penman–Monteith method requiring complete meteorological input variables needs to be overcom...

Bibliographic Details
Main Authors: Xiaoqiang Liu, Lifeng Wu, Fucang Zhang, Guomin Huang, Fulai Yan, Wenqiang Bai
Format: Article
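Language: English
Published: MDPI AG 2021-12-01
Series: Water
Subjects: data splitting; length of years; random forest; extreme gradient boosting; reference crop evapotranspiration
Online Access: https://www.mdpi.com/2073-4441/13/23/3478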
author Xiaoqiang Liu
Lifeng Wu
Fucang Zhang
Guomin Huang
Fulai Yan
Wenqiang Bai
author_sort Xiaoqiang Liu
collection DOAJ
description To improve the accuracy of estimating reference crop evapotranspiration for the efficient management of water resources and the optimal design of irrigation scheduling, the drawback of the traditional FAO-56 Penman–Monteith method requiring complete meteorological input variables needs to be overcome. This study evaluates the effects of using five data splitting strategies and three different time lengths of input datasets on predicting ET<sub>0</sub>. The random forest (RF) and extreme gradient boosting (XGB) models coupled with a K-fold cross-validation approach were applied to accomplish this objective. The results showed that the accuracy of the RF (R<sup>2</sup> = 0.862, RMSE = 0.528, MAE = 0.383, NSE = 0.854) was overall better than that of XGB (R<sup>2</sup> = 0.867, RMSE = 0.517, MAE = 0.377, NSE = 0.860) in different input parameters. Both the RF and XGB models with the combination of T<sub>max</sub>, T<sub>min</sub>, and Rs as inputs provided better accuracy on daily ET<sub>0</sub> estimation than the corresponding models with other input combinations. Among all the data splitting strategies, S5 (with a 9:1 proportion) showed the optimal performance. Compared with the length of 30 years, the estimation accuracy of the 50-year length with limited data was reduced, while the length of meteorological data of 10 years improved the accuracy in southern China. Nevertheless, the performance of the 10-year data was the worst among the three time spans when considering the independent test. Therefore, to improve the daily ET<sub>0</sub> predicting performance of the tree-based models in humid regions of China, the random forest model with datasets of 30 years and the 9:1 data splitting strategy is recommended.
first_indexed 2024-03-10T04:42:41Z
format Article
id doaj.art-a763c58796274e38a78913c48cefc0c3
institution Directory Open Access Journal
issn 2073-4441
language English
last_indexed 2024-03-10T04:42:41Z
publishDate 2021-12-01
publisher MDPI AG
record_format Article
series Water
spelling doaj.art-a763c58796274e38a78913c48cefc0c3
2023-11-23T03:16:12Z
eng
MDPI AG
Water
2073-4441
2021-12-01
Volume 13, Issue 23, Article 3478
DOI 10.3390/w13233478
Xiaoqiang Liu (Key Laboratory of Agricultural Soil and Water Engineering in Arid and Semiarid Areas of the Ministry of Education, Northwest A&F University, Yangling, Xianyang 712100, China)
Lifeng Wu (School of Hydraulic and Ecological Engineering, Nanchang Institute of Technology, Nanchang 330099, China)
Fucang Zhang (Key Laboratory of Agricultural Soil and Water Engineering in Arid and Semiarid Areas of the Ministry of Education, Northwest A&F University, Yangling, Xianyang 712100, China)
Guomin Huang (School of Hydraulic and Ecological Engineering, Nanchang Institute of Technology, Nanchang 330099, China)
Fulai Yan (Key Laboratory of Agricultural Soil and Water Engineering in Arid and Semiarid Areas of the Ministry of Education, Northwest A&F University, Yangling, Xianyang 712100, China)
Wenqiang Bai (Key Laboratory of Agricultural Soil and Water Engineering in Arid and Semiarid Areas of the Ministry of Education, Northwest A&F University, Yangling, Xianyang 712100, China)
title Splitting and Length of Years for Improving Tree-Based Models to Predict Reference Crop Evapotranspiration in the Humid Regions of China
topic data splitting
length of years
random forest
extreme gradient boosting
reference crop evapotranspiration
url https://www.mdpi.com/2073-4441/13/23/3478