Improving the Estimation of PM<sub>2.5</sub> Concentration in the North China Area by Introducing an Attention Mechanism into Random Forest

Fine particulate matter with an aerodynamic diameter less than 2.5 µm (PM<sub>2.5</sub>)...

Bibliographic Details
Main Authors: Luo Zhang, Zhengqiang Li, Jie Guang, Yisong Xie, Zheng Shi, Haoran Gu, Yang Zheng
Format: Article
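Language: English
Published: MDPI AG 2024-03-01
Series: Atmosphere
Subjects: PM<sub>2.5</sub>; random forest; attention mechanism; spatiotemporal prediction; multi-source data
Online Access: https://www.mdpi.com/2073-4433/15/3/384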
collection DOAJ
description Fine particulate matter with an aerodynamic diameter of less than 2.5 µm (PM<sub>2.5</sub>) profoundly affects environmental systems, human health and economic structures. Multi-source data and advanced machine-learning and deep-learning methods have created new opportunities for estimating PM<sub>2.5</sub> concentrations at high spatiotemporal resolution. In this paper, the Random Forest (RF) algorithm was applied to estimate hourly PM<sub>2.5</sub> in the North China area (Beijing–Tianjin–Hebei, BTH) based on aerosol optical depth (AOD) products from the Advanced Himawari Imager (AHI) on the next-generation geostationary meteorological satellite Himawari-8. To improve the estimation of PM<sub>2.5</sub> concentration across large areas, we construct a method that uses an attention mechanism to co-weight environmental similarity and geographical distance, so that it can efficiently characterize the influence of the spatial–temporal information hidden in adjacent ground monitoring sites.
In the experimental results, the hourly PM<sub>2.5</sub> estimates are well correlated with ground measurements in BTH, with a coefficient of determination (R<sup>2</sup>) of 0.887, a root-mean-square error (RMSE) of 18.31 µg/m<sup>3</sup>, and a mean absolute error (MAE) of 11.17 µg/m<sup>3</sup>, indicating good model performance. In addition, this paper comprehensively analyzes the effectiveness of the multi-source data in the estimation process, with the aim of simplifying the model structure and improving estimation efficiency while preserving accuracy.
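The abstract does not give the attention formulation, so the following is only a minimal sketch of one way the co-weighting idea could look: neighbouring monitoring sites are scored by environmental similarity and geographical distance, the scores are passed through a softmax, and the weighted neighbour PM<sub>2.5</sub> becomes an extra input feature for a Random Forest. The scoring rule, the `length_km` scale, the dictionary keys and all function names are assumptions made for illustration, not the authors' method. The three metric helpers use the standard definitions; whether the paper computes R<sup>2</sup> as 1 − SS<sub>res</sub>/SS<sub>tot</sub> or as a squared correlation is not stated.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def attention_weights(target, neighbors, length_km=50.0):
    """Softmax weights over neighbouring sites (hypothetical scoring rule).

    score = -(Euclidean distance between environmental vectors)
            - (geographical distance / length_km)
    so similar and nearby sites receive larger weights.
    """
    scores = []
    for nb in neighbors:
        env = math.sqrt(sum((a - b) ** 2 for a, b in zip(target["env"], nb["env"])))
        geo = haversine_km(target["lat"], target["lon"], nb["lat"], nb["lon"])
        scores.append(-env - geo / length_km)
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def neighbor_pm25_feature(target, neighbors, length_km=50.0):
    """Attention-weighted mean of neighbour PM2.5, usable as one RF input feature."""
    w = attention_weights(target, neighbors, length_km)
    return sum(wi * nb["pm25"] for wi, nb in zip(w, neighbors))


def rmse(obs, pred):
    """Root-mean-square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Illustrative values only, not data from the paper.
target = {"lat": 39.9, "lon": 116.4, "env": [0.5, 0.2]}
neighbors = [
    {"lat": 39.9, "lon": 116.4, "env": [0.5, 0.2], "pm25": 80.0},
    {"lat": 38.0, "lon": 114.5, "env": [0.9, 0.7], "pm25": 40.0},
]
weights = attention_weights(target, neighbors)
feature = neighbor_pm25_feature(target, neighbors)
```

In this toy input the first neighbour is co-located with the target and environmentally identical to it, so it receives almost all of the weight and the feature stays close to its PM<sub>2.5</sub> value. A learned attention layer would replace the fixed scoring rule with trainable parameters.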
first_indexed 2024-04-24T18:33:53Z
format Article
id doaj.art-e9dffab1a829425298be29ac55abf155
institution Directory Open Access Journal
issn 2073-4433
language English
last_indexed 2024-04-24T18:33:53Z
publishDate 2024-03-01
publisher MDPI AG
record_format Article
series Atmosphere
spelling doaj.art-e9dffab1a829425298be29ac55abf155
2024-03-27T13:20:52Z
eng
MDPI AG
Atmosphere, ISSN 2073-4433, 2024-03-01, vol. 15, no. 3, article 384
doi:10.3390/atmos15030384
Improving the Estimation of PM<sub>2.5</sub> Concentration in the North China Area by Introducing an Attention Mechanism into Random Forest
Luo Zhang: State Environmental Protection Key Laboratory of Satellite Remote Sensing, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
Zhengqiang Li: State Environmental Protection Key Laboratory of Satellite Remote Sensing, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
Jie Guang: State Environmental Protection Key Laboratory of Satellite Remote Sensing, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
Yisong Xie: State Environmental Protection Key Laboratory of Satellite Remote Sensing, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
Zheng Shi: The Administrative Center for China’s Agenda 21, Beijing 100038, China
Haoran Gu: State Environmental Protection Key Laboratory of Satellite Remote Sensing, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
Yang Zheng: State Environmental Protection Key Laboratory of Satellite Remote Sensing, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
title Improving the Estimation of PM<sub>2.5</sub> Concentration in the North China Area by Introducing an Attention Mechanism into Random Forest
topic PM<sub>2.5</sub>
random forest
attention mechanism
spatiotemporal prediction
multi-source data
url https://www.mdpi.com/2073-4433/15/3/384