Contributions of various driving factors to air pollution events: Interpretability analysis from Machine learning perspective

Bibliographic Details
Main Authors: Tianshuai Li, Qingzhu Zhang, Yanbo Peng, Xu Guan, Lei Li, Jiangshan Mu, Xinfeng Wang, Xianwei Yin, Qiao Wang
Format: Article
Language: English
Published: Elsevier 2023-03-01
Series: Environment International
Subjects: PM2.5; Air pollution; Machine learning; Permutation importance; PDP; SHAP
Online Access: http://www.sciencedirect.com/science/article/pii/S0160412023001344
author Tianshuai Li
Qingzhu Zhang
Yanbo Peng
Xu Guan
Lei Li
Jiangshan Mu
Xinfeng Wang
Xianwei Yin
Qiao Wang
collection DOAJ
description Air quality in China has improved substantially; however, fine particulate matter (PM2.5) remains at high levels in many areas. PM2.5 pollution is a complex process attributed to gaseous precursors and to chemical and meteorological factors. Quantifying the contribution of each variable to air pollution can facilitate the formulation of effective policies to precisely eliminate air pollution. In this study, we first used a decision plot to map out the decision process of the Random Forest (RF) model for a single hourly data set and constructed a framework for analyzing the causes of air pollution using multiple interpretability methods. Permutation importance was used to qualitatively analyze the effect of each variable on PM2.5 concentrations. The sensitivity of secondary inorganic aerosols (SIA: SO₄²⁻, NO₃⁻ and NH₄⁺) to PM2.5 was verified with partial dependence plots (PDP). Shapley Additive Explanations (SHAP) were used to quantify the contributions of the drivers behind ten air pollution events. The RF model predicted PM2.5 concentrations accurately, with a coefficient of determination (R²) of 0.94, a root mean square error (RMSE) of 9.4 μg/m³ and a mean absolute error (MAE) of 5.7 μg/m³. The order of sensitivity of SIA to PM2.5 was NH₄⁺ > NO₃⁻ > SO₄²⁻. Fossil fuel and biomass combustion may have contributed to the air pollution events in Zibo in autumn–winter 2021. NH₄⁺ contributed 19.9–65.4 μg/m³ across the ten air pollution events (APs). K, NO₃⁻, EC and OC were the other main drivers, contributing 8.7 ± 2.7 μg/m³, 6.8 ± 7.5 μg/m³, 3.6 ± 5.8 μg/m³ and 2.5 ± 2.0 μg/m³, respectively. Lower temperature and higher humidity were vital factors promoting the formation of NO₃⁻. Our study may provide a methodological framework for precise air pollution management.
first_indexed 2024-04-09T23:54:41Z
format Article
id doaj.art-73c03124437e4efd9d3d7e727bcc6142
institution Directory Open Access Journal
issn 0160-4120
language English
last_indexed 2024-04-09T23:54:41Z
publishDate 2023-03-01
publisher Elsevier
record_format Article
series Environment International
spelling doaj.art-73c03124437e4efd9d3d7e727bcc6142 2023-03-17T04:32:27Z eng Elsevier Environment International 0160-4120 2023-03-01 volume 173, article 107861
Tianshuai Li (Big Data Research Center for Ecology and Environment, Environment Research Institute, Shandong University, Qingdao 266003, PR China)
Qingzhu Zhang (Big Data Research Center for Ecology and Environment, Environment Research Institute, Shandong University, Qingdao 266003, PR China)
Yanbo Peng (Big Data Research Center for Ecology and Environment, Environment Research Institute, Shandong University, Qingdao 266003, PR China; Shandong Academy for Environmental Planning, Jinan 250101, PR China; corresponding author at: Big Data Research Center for Ecology and Environment, Environment Research Institute, Shandong University, Qingdao 266003, PR China)
Xu Guan (Shandong Academy for Environmental Planning, Jinan 250101, PR China)
Lei Li (Big Data Research Center for Ecology and Environment, Environment Research Institute, Shandong University, Qingdao 266003, PR China)
Jiangshan Mu (Big Data Research Center for Ecology and Environment, Environment Research Institute, Shandong University, Qingdao 266003, PR China)
Xinfeng Wang (Big Data Research Center for Ecology and Environment, Environment Research Institute, Shandong University, Qingdao 266003, PR China)
Xianwei Yin (Zibo Ecological Environment Monitoring Center of Shandong Province, Zibo 255040, PR China)
Qiao Wang (Big Data Research Center for Ecology and Environment, Environment Research Institute, Shandong University, Qingdao 266003, PR China)
title Contributions of various driving factors to air pollution events: Interpretability analysis from Machine learning perspective
topic PM2.5
Air pollution
Machine learning
Permutation importance
PDP
SHAP
url http://www.sciencedirect.com/science/article/pii/S0160412023001344