How Important Is Satellite-Retrieved Aerosol Optical Depth in Deriving Surface PM<sub>2.5</sub> Using Machine Learning?

PM<sub>2.5</sub> refers to the total mass concentration of tiny particulates in the atmosphere near the surface, obtained by means of in situ observations and satellite remote sensing. Given the highly limited number of ground observation stations of inhomogeneous distribution and an ill...

Bibliographic Details
Main Authors: Zhongyan Tian, Jing Wei, Zhanqing Li
Format: Article
Language: English
Published: MDPI AG 2023-07-01
Series: Remote Sensing
Subjects: machine learning; AOD; PM<sub>2.5</sub> retrieval; station density; importance assessment
Online Access: https://www.mdpi.com/2072-4292/15/15/3780
author Zhongyan Tian
Jing Wei
Zhanqing Li
collection DOAJ
description PM<sub>2.5</sub> refers to the total mass concentration of tiny particulates in the atmosphere near the surface, obtained by means of in situ observations and satellite remote sensing. Given the highly limited number of ground observation stations of inhomogeneous distribution and an ill-posed remote sensing approach, increasing efforts have been devoted to the application of machine-learning (ML) models to both ground and satellite data. A key satellite-derived parameter, aerosol optical depth (AOD), has been most commonly used as a proxy for PM<sub>2.5</sub>, although their correlation is fraught with large uncertainties. A critical question that has been overlooked is how much AOD helps to improve the retrieval of PM<sub>2.5</sub> relative to the uncertainty it introduces. The question is addressed here by taking advantage of high-density PM<sub>2.5</sub> stations in eastern China to evaluate the contribution of AOD, determined as the difference in the accuracy of PM<sub>2.5</sub> retrievals with and without AOD for varying densities of PM<sub>2.5</sub> stations, using four popular ML models (i.e., Random Forest, Extra-trees, XGBoost, and LightGBM). Our results reveal that as the density of monitoring stations decreases, both the feature importance and permutation importance of satellite AOD demonstrate a consistent upward trend (<i>p</i> < 0.05). Furthermore, the ML models without AOD exhibit faster declines in overall accuracy and predictive ability than the models with AOD, as assessed using the sample-based and station-based (spatial) independent cross-validation approaches. Overall, a 10% reduction in the number of stations results in an increase of 0.7–1.2% and 0.6–1.2% in uncertainty in estimated and predicted accuracies, respectively. These findings attest to the indispensable role of satellite AOD in the PM<sub>2.5</sub> retrieval process through ML because it can significantly mitigate the negative impact of the sparse distribution of monitoring sites. This role becomes more important as the number of PM<sub>2.5</sub> stations decreases.
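The evaluation design summarized in the description (accuracy with AOD minus accuracy without AOD, repeated for thinner station networks, under station-based cross-validation) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the data are synthetic, the coefficients linking AOD and temperature to PM2.5 are invented, the function names `make_synthetic`, `station_cv_r2`, and `aod_contribution` are hypothetical, only a Random Forest is used, R² stands in for the accuracy metric, and scikit-learn's `GroupKFold` stands in for the station-based (spatial) cross-validation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import GroupKFold, cross_val_score


def make_synthetic(n_stations=40, n_days=30, seed=0):
    """Synthetic stand-in data (assumption): PM2.5 driven by AOD and temperature."""
    rng = np.random.default_rng(seed)
    station = np.repeat(np.arange(n_stations), n_days)
    aod = rng.uniform(0.0, 1.0, station.size)
    temp = rng.normal(0.0, 1.0, station.size)
    pm25 = 50.0 * aod + 5.0 * temp + rng.normal(0.0, 2.0, station.size)
    X = np.column_stack([aod, temp])  # column 0 = AOD, column 1 = temperature
    return X, pm25, station


def station_cv_r2(X, y, groups, seed=0):
    """Mean R^2 over 5 folds in which no station appears in both train and test."""
    model = RandomForestRegressor(n_estimators=50, random_state=seed)
    scores = cross_val_score(model, X, y, groups=groups,
                             cv=GroupKFold(n_splits=5), scoring="r2")
    return float(scores.mean())


def aod_contribution(X, y, groups, keep_fraction, seed=0):
    """Keep a random subset of stations, then compare models with and without AOD."""
    rng = np.random.default_rng(seed)
    stations = np.unique(groups)
    # At least 5 stations are needed for 5-fold grouped cross-validation.
    n_keep = min(stations.size, max(5, int(round(keep_fraction * stations.size))))
    kept = rng.choice(stations, size=n_keep, replace=False)
    mask = np.isin(groups, kept)
    Xs, ys, gs = X[mask], y[mask], groups[mask]

    r2_with = station_cv_r2(Xs, ys, gs, seed)
    r2_without = station_cv_r2(Xs[:, 1:], ys, gs, seed)  # drop the AOD column

    # Permutation importance of AOD for a model fitted on the retained stations.
    model = RandomForestRegressor(n_estimators=50, random_state=seed).fit(Xs, ys)
    perm = permutation_importance(model, Xs, ys, n_repeats=5, random_state=seed)
    return {
        "r2_with": r2_with,
        "r2_without": r2_without,
        "gain": r2_with - r2_without,
        "aod_perm_importance": float(perm.importances_mean[0]),
    }


X, y, groups = make_synthetic()
results = {frac: aod_contribution(X, y, groups, frac) for frac in (1.0, 0.5, 0.25)}
for frac, res in results.items():
    print(frac, res)
```

Because the synthetic PM2.5 is generated independently of station location, this sketch is not expected to reproduce the density trend reported in the description; it only shows how the with/without-AOD difference and the permutation importance could be computed for each station density.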
first_indexed 2024-03-11T00:17:23Z
format Article
id doaj.art-502b4513acc04b51a0f8e3ca2c205df1
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-11T00:17:23Z
publishDate 2023-07-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-502b4513acc04b51a0f8e3ca2c205df1
2023-11-18T23:30:43Z
eng
MDPI AG
Remote Sensing
2072-4292
2023-07-01
Volume 15, Issue 15, Article 3780
DOI 10.3390/rs15153780
Zhongyan Tian (Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China)
Jing Wei (Department of Atmospheric and Oceanic Science, Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD 20740, USA)
Zhanqing Li (Department of Atmospheric and Oceanic Science, Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD 20740, USA)
title How Important Is Satellite-Retrieved Aerosol Optical Depth in Deriving Surface PM<sub>2.5</sub> Using Machine Learning?
topic machine learning
AOD
PM<sub>2.5</sub> retrieval
station density
importance assessment
url https://www.mdpi.com/2072-4292/15/15/3780