Development of a Machine Learning Framework to Aid Climate Model Assessment and Improvement: Case Study of Surface Soil Moisture

The development of a computationally efficient machine learning-based framework to understand the underlying causes for biases in climate model simulated fields is presented in this study. The framework consists of a two-step approach, with the first step involving the development of a Random Forest...

Bibliographic Details
Main Authors:	Francisco Andree Ramírez Casas, Laxmi Sushama, Bernardo Teufel
Format:	Article
Language:	English
Published:	MDPI AG 2022-10-01
Series:	Hydrology
Subjects:	machine learning-based climate model validation; Random Forest; surface soil moisture; bias assessment; Eastern Canada
Online Access:	https://www.mdpi.com/2306-5338/9/10/186
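
The second step of the framework summarized in this record's abstract (replacing observed predictors with climate-model predictors one at a time and comparing each emulation with a reference emulation driven by all observed predictors) can be sketched as follows. This is a minimal illustration, not the authors' code: `substitution_biases`, `toy_emulator`, the predictor names and all numbers are hypothetical. The `predict` argument stands in for the trained Random Forest emulator; in practice it could wrap the `predict` method of a fitted regressor such as scikit-learn's `RandomForestRegressor`, which is an assumption about tooling that the record does not state.

```python
from statistics import mean
from typing import Callable, Dict, List

# One column of values per predictor name (hypothetical data layout).
Columns = Dict[str, List[float]]


def substitution_biases(predict: Callable[[Columns], List[float]],
                        observed: Columns,
                        simulated: Columns) -> Dict[str, float]:
    """Mean change in the emulated variable when a single observed
    predictor is replaced by its climate-model counterpart."""
    # Reference emulation: driven by all observed predictors.
    reference = mean(predict(observed))
    contributions = {}
    for name in observed:
        swapped = dict(observed)         # shallow copy; inputs are not mutated
        swapped[name] = simulated[name]  # replace exactly one predictor
        contributions[name] = mean(predict(swapped)) - reference
    return contributions


def toy_emulator(cols: Columns) -> List[float]:
    """Linear stand-in for a trained emulator (illustrative only)."""
    return [0.30 + 0.02 * w - 0.01 * t
            for w, t in zip(cols["water_availability"], cols["t2m"])]


# Made-up values: the simulated water availability is biased, temperature is not.
observed = {"water_availability": [1.0, 2.0, 3.0], "t2m": [10.0, 12.0, 14.0]}
simulated = {"water_availability": [2.0, 3.0, 4.0], "t2m": [10.0, 12.0, 14.0]}

biases = substitution_biases(toy_emulator, observed, simulated)
for name, value in biases.items():
    print(f"{name}: {value:+.3f}")
```

Passing the emulator as a plain callable keeps the substitution logic independent of the regression library, so the same loop would apply to a per-grid-cell model or to any other fitted emulator.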

_version_	1797472918293184512
author	Francisco Andree Ramírez Casas; Laxmi Sushama; Bernardo Teufel
collection	DOAJ
description	This study presents a computationally efficient machine learning-based framework for understanding the underlying causes of biases in climate model simulated fields. The framework consists of a two-step approach. The first step involves the development of a Random Forest (RF) model, trained on observed data of the climate variable of interest and related predictors. The second step involves emulations of the climate variable of interest with the RF model developed in step one, replacing the observed predictors with those from the climate model one at a time. The assumption is that comparing these emulations with a reference emulation driven by all observed predictors can shed light on the contribution of the respective predictor biases to the biases in the climate model simulation. The proposed framework is used to understand the biases in the Global Environmental Multiscale (GEM) model simulated surface soil moisture (SSM) for the April–September period, over a domain covering part of north-east Canada. The grid cell-based RF model, trained on daily SSM and related climate predictors (water availability, 2 m temperature, relative humidity, snowmelt, maximum snow water equivalent) from the fifth generation European Centre for Medium-Range Weather Forecasts reanalysis (ERA5), demonstrates great skill in emulating SSM, with a root mean square error of 0.036. Comparison of the five RF emulations based on GEM predictors with that based on ERA5 predictors suggests that the biases in the mean April–September SSM can be attributed mainly to biases in three predictors: water availability, 2 m temperature and relative humidity. The regions where these predictors contribute to biases in SSM are mostly collocated with the regions where the predictor importance analysis shows them to be among the top three influential predictors, i.e., 2 m temperature in the southern part of the domain, relative humidity in the northern part of the domain and water availability over the rest of the domain. The framework thus successfully identifies the main causes of SSM biases without requiring expensive simulations with the climate model, albeit with slightly reduced skill for heavily perturbed simulations. Furthermore, by informing targeted climate model improvements, identification of the causes of biases can lead to additional reductions in computational costs.
first_indexed	2024-03-09T20:07:57Z
format	Article
id	doaj.art-91f62cfe10fb47a99c4ca9a85645657d
institution	Directory of Open Access Journals
issn	2306-5338
language	English
last_indexed	2024-03-09T20:07:57Z
publishDate	2022-10-01
publisher	MDPI AG
record_format	Article
series	Hydrology
spelling	doaj.art-91f62cfe10fb47a99c4ca9a85645657d; 2023-11-24T00:25:23Z; eng; MDPI AG; Hydrology; ISSN 2306-5338; 2022-10-01; vol. 9, no. 10, art. 186; DOI 10.3390/hydrology9100186; Development of a Machine Learning Framework to Aid Climate Model Assessment and Improvement: Case Study of Surface Soil Moisture; Francisco Andree Ramírez Casas, Laxmi Sushama, Bernardo Teufel (all: Department of Civil Engineering, Trottier Institute for Sustainability in Engineering and Design, McGill University, Montreal, QC H3A 0C3, Canada); https://www.mdpi.com/2306-5338/9/10/186
title	Development of a Machine Learning Framework to Aid Climate Model Assessment and Improvement: Case Study of Surface Soil Moisture
topic	machine learning-based climate model validation; Random Forest; surface soil moisture; bias assessment; Eastern Canada
url	https://www.mdpi.com/2306-5338/9/10/186

Similar Items