GPRChinaTemp1km: a high-resolution monthly air temperature data set for China (1951–2020) based on machine learning


Bibliographic Details
Main Authors: Q. He, M. Wang, K. Liu, K. Li, Z. Jiang
Format: Article
Language: English
Published: Copernicus Publications 2022-07-01
Series: Earth System Science Data
Online Access: https://essd.copernicus.org/articles/14/3273/2022/essd-14-3273-2022.pdf
collection DOAJ
description An accurate, spatially continuous air temperature data set is crucial for multiple applications in the environmental and ecological sciences. Existing spatial interpolation methods have relatively low accuracy, and the resolution of available long-term gridded air temperature products for China is coarse. Point observations from meteorological stations can provide long-term air temperature data series but cannot represent spatially continuous information. Here, we devised a method for spatial interpolation of air temperature data from meteorological stations based on machine learning tools. First, to determine the optimal interpolation method, we compared three machine learning models: random forest, support vector machine, and Gaussian process regression. A comparison of the mean absolute error, root mean square error, coefficient of determination, and residuals revealed that Gaussian process regression had high accuracy and clearly outperformed the other two models in interpolating monthly maximum, minimum, and mean air temperatures. The machine learning methods were then compared with three traditional methods frequently used for spatial interpolation: inverse distance weighting, ordinary kriging, and ANUSPLIN (Australian National University Spline). The Gaussian process regression model had higher accuracy and greater robustness than the traditional methods in interpolating monthly maximum, minimum, and mean air temperatures in each month. A comparison with the TerraClimate (Monthly Climate and Climatic Water Balance for Global Terrestrial Surfaces), FLDAS (Famine Early Warning Systems Network (FEWS NET) Land Data Assimilation System), and ERA5 (ECMWF, European Centre for Medium-Range Weather Forecasts, Climate Reanalysis) data sets revealed that the temperature data generated using the Gaussian process regression model were more accurate.
Finally, using the Gaussian process regression method, we produced a long-term (January 1951 to December 2020) gridded monthly air temperature data set for China with 1 km resolution and high accuracy, which we named GPRChinaTemp1km. The data set consists of three variables: monthly mean air temperature, monthly maximum air temperature, and monthly minimum air temperature. The GPRChinaTemp1km data were used to analyse the spatiotemporal variations of air temperature using Theil–Sen median trend analysis in combination with the Mann–Kendall test. Monthly mean and minimum air temperatures across China showed a significant increasing trend in each month, whereas monthly maximum air temperatures showed a more spatially heterogeneous pattern, with significant increase, non-significant increase, and non-significant decrease. The GPRChinaTemp1km data set is publicly available at https://doi.org/10.5281/zenodo.5112122 (He et al., 2021a) for monthly maximum air temperature, at https://doi.org/10.5281/zenodo.5111989 (He et al., 2021b) for monthly mean air temperature, and at https://doi.org/10.5281/zenodo.5112232 (He et al., 2021c) for monthly minimum air temperature.
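The trend analysis named in the description (Theil–Sen median slope combined with the Mann–Kendall test) can be illustrated with a minimal sketch for a single time series. This is not the authors' implementation, which this record does not give: the function names and the sample values are hypothetical, the series is assumed to be equally spaced in time (one value per year), and the p-value uses the usual normal approximation with a correction for ties.

```python
import math
from collections import Counter
from itertools import combinations
from statistics import median


def theil_sen_slope(values):
    """Median of the pairwise slopes (values[j] - values[i]) / (j - i), i < j."""
    pairs = combinations(range(len(values)), 2)
    return median((values[j] - values[i]) / (j - i) for i, j in pairs)


def mann_kendall(values):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test."""
    n = len(values)
    # S is the number of increasing pairs minus the number of decreasing pairs.
    s = sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i, j in combinations(range(n), 2)
    )
    # Variance of S, reduced by the standard correction for tied values.
    ties = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5) - ties) / 18
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return s, z, p


# Hypothetical January mean temperatures (degrees C) for one grid cell.
series = [-4.1, -3.8, -4.0, -3.5, -3.6, -3.1, -3.2, -2.9, -2.7, -2.8]
slope = theil_sen_slope(series)
s, z, p = mann_kendall(series)
print(f"slope={slope:.3f} per year, S={s}, Z={z:.2f}, p={p:.4f}")
```

Under these assumptions, a trend would be called significant when p falls below a chosen level such as 0.05, and the sign of the Theil–Sen slope gives its direction. Applying this to a gridded product would mean repeating the calculation for each grid cell and calendar month.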
first_indexed 2024-12-11T02:27:29Z
id doaj.art-d3d13c585866437795075efd42586bf4
institution Directory Open Access Journal
issn 1866-3508
1866-3516
last_indexed 2024-12-11T02:27:29Z
doi 10.5194/essd-14-3273-2022
volume 14
pages 3273–3292
affiliation Q. He: Academy of Disaster Reduction and Emergency Management, Beijing Normal University, 100875 Beijing, China; Faculty of Geographical Science, Beijing Normal University, 100875 Beijing, China
M. Wang: Academy of Disaster Reduction and Emergency Management, Beijing Normal University, 100875 Beijing, China; School of National Safety and Emergency Management, Beijing Normal University, 100875 Beijing, China
K. Liu: Academy of Disaster Reduction and Emergency Management, Beijing Normal University, 100875 Beijing, China; School of National Safety and Emergency Management, Beijing Normal University, 100875 Beijing, China
K. Li: Academy of Disaster Reduction and Emergency Management, Beijing Normal University, 100875 Beijing, China; Faculty of Geographical Science, Beijing Normal University, 100875 Beijing, China
Z. Jiang: Academy of Disaster Reduction and Emergency Management, Beijing Normal University, 100875 Beijing, China; Faculty of Geographical Science, Beijing Normal University, 100875 Beijing, China