Retrieval of Volcanic Ash Cloud Base Height Using Machine Learning Algorithms

There are distinct differences between radiation characteristics of volcanic ash and meteorological clouds, and conventional retrieval methods for cloud base height (CBH) of the latter are difficult to apply to volcanic ash without substantial parameterisation and model correction. Furthermore, exis...

Bibliographic Details
Main Authors: Fenghua Zhao, Jiawei Xia, Lin Zhu, Hongfu Sun, Dexin Zhao
Format: Article
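Language: English
Published: MDPI AG 2023-01-01
Series: Atmosphere
Subjects: volcanic ash cloud base height; machine learning; CALIOP lidar data; passive satellite measurement
Online Access: https://www.mdpi.com/2073-4433/14/2/228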
collection DOAJ
description The radiative characteristics of volcanic ash differ distinctly from those of meteorological clouds, so conventional methods for retrieving the cloud base height (CBH) of meteorological clouds are difficult to apply to volcanic ash without substantial parameterisation and model correction. Furthermore, existing CBH inversion methods have limitations, including reliance on many empirical formulae and dependence on the accuracy of upstream cloud products. A machine learning (ML) method was developed for the retrieval of volcanic ash cloud base height (VBH) to reduce the uncertainties of physical CBH retrieval methods. The method combines vertical profile information from polar-orbit active remote sensing, namely the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP), with geostationary passive remote-sensing measurements from the Spinning Enhanced Visible and Infrared Imager (SEVIRI) and the Advanced Geostationary Radiation Imager (AGRI) aboard the Meteosat Second Generation (MSG) and FengYun-4B (FY-4B) satellites, respectively. It is a statistics-based algorithm that pairs principal component analysis (PCA) with one of four ML algorithms: k-nearest neighbour (KNN), extreme gradient boosting (XGBoost), random forest (RF), or gradient boosting decision tree (GBDT). Eruptions of the Eyjafjallajökull volcano (Iceland) during April–May 2010, the Puyehue-Cordón Caulle volcanic complex (Chilean Andes) in June 2011, and the Hunga Tonga-Hunga Ha’apai volcano (Tonga) in January 2022 were selected as typical cases for constructing the training and validation sample sets. The combination of PCA and GBDT is more accurate than the other combinations, with a mean absolute error (MAE) of 1.152 km, a root mean square error (RMSE) of 1.529 km, and a Pearson’s correlation coefficient (r) of 0.724. Applying PCA as an additional step before training reduces the correlation among input predictors and improves algorithm accuracy. Although the ML algorithm performs well under relatively simple single-layer volcanic ash cloud conditions, it tends to overestimate VBH in multi-layer conditions, a problem that also remains unresolved in meteorological CBH retrieval.
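The three accuracy measures quoted in the description (MAE, RMSE, and Pearson's r) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the four sample heights below are hypothetical and serve only to show how each statistic is computed from paired reference and retrieved base heights.

```python
import math


def mae(reference, retrieved):
    """Mean absolute error between paired height values (same unit as input)."""
    return sum(abs(p - t) for t, p in zip(reference, retrieved)) / len(reference)


def rmse(reference, retrieved):
    """Root mean square error between paired height values."""
    squared = sum((p - t) ** 2 for t, p in zip(reference, retrieved))
    return math.sqrt(squared / len(reference))


def pearson_r(reference, retrieved):
    """Pearson correlation coefficient between the two series."""
    n = len(reference)
    mean_t = sum(reference) / n
    mean_p = sum(retrieved) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(reference, retrieved))
    var_t = sum((t - mean_t) ** 2 for t in reference)
    var_p = sum((p - mean_p) ** 2 for p in retrieved)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical base heights in km (illustrative values, not data from the article).
reference_vbh = [2.0, 4.0, 6.0, 8.0]   # e.g. lidar-derived reference heights
retrieved_vbh = [3.0, 4.0, 5.0, 10.0]  # e.g. ML-retrieved heights

print("MAE  (km):", mae(reference_vbh, retrieved_vbh))
print("RMSE (km):", rmse(reference_vbh, retrieved_vbh))
print("r        :", pearson_r(reference_vbh, retrieved_vbh))
```

MAE and RMSE are in the same unit as the heights (km here), and RMSE weights large errors more heavily than MAE, which is why the reported RMSE (1.529 km) exceeds the reported MAE (1.152 km).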
first_indexed 2024-03-11T09:10:47Z
id doaj.art-2c4c77867fc54b15b1eb58f6f103f207
institution Directory Open Access Journal
issn 2073-4433
last_indexed 2024-03-11T09:10:47Z
doi 10.3390/atmos14020228
volume 14
issue 2
article_number 228
affiliations Fenghua Zhao: College of Geoscience and Surveying Engineering, China University of Mining and Technology, Beijing 100083, China
Jiawei Xia: College of Geoscience and Surveying Engineering, China University of Mining and Technology, Beijing 100083, China
Lin Zhu: National Satellite Meteorological Center, China Meteorological Administration, Beijing 100081, China
Hongfu Sun: College of Geoscience and Surveying Engineering, China University of Mining and Technology, Beijing 100083, China
Dexin Zhao: College of Geoscience and Surveying Engineering, China University of Mining and Technology, Beijing 100083, China
topic volcanic ash cloud base height
machine learning
CALIOP lidar data
passive satellite measurement