Regression analysis of air pollution and pediatric respiratory diseases based on interpretable machine learning

Air pollution is of high relevance to human health. In this study, multiple machine-learning (ML) models—linear regression, random forest (RF), AdaBoost, and neural networks (NNs)—were used to explore the potential impacts of air-pollutant concentrations on the incidence of pediatric respiratory dis...

Bibliographic Details
Main Authors: Yan Ji, Xiefei Zhi, Ying Wu, Yanqiu Zhang, Yitong Yang, Ting Peng, Luying Ji
Format: Article
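Language: English
Published: Frontiers Media S.A. 2023-03-01
Series: Frontiers in Earth Science
Subjects: air pollutants; respiratory diseases in children; explainable artificial intelligence (XAI); feature importance analysis; Taizhou city
Online Access: https://www.frontiersin.org/articles/10.3389/feart.2023.1105140/full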
author Yan Ji
Xiefei Zhi
Ying Wu
Yanqiu Zhang
Yitong Yang
Ting Peng
Luying Ji
collection DOAJ
description Air pollution is highly relevant to human health. In this study, multiple machine-learning (ML) models, namely linear regression, random forest (RF), AdaBoost, and neural networks (NNs), were used to explore the potential impacts of air-pollutant concentrations on the incidence of pediatric respiratory diseases in Taizhou, China. Several explainable artificial intelligence (XAI) methods were then applied to analyze the model outputs and quantify feature importance. The results show significant seasonal variations in both the numbers of pediatric respiratory outpatients and the concentrations of air pollutants. The concentrations of NO2, CO, and particulate matter (PM10 and PM2.5), as well as the numbers of outpatients, peak in winter. This indicates that air pollution is a major factor in pediatric respiratory diseases. The regression results show that ML methods can capture the trends and turning points of clinic visits, and that the non-linear models were superior to the linear ones. Among them, the RF model performed best. XAI analysis of the RF model found that AQI, O3, PM10, and the current month are the most important predictors of the number of pediatric respiratory outpatients. The number of outpatients rises with increasing AQI, especially with increasing particulate matter. The study indicates that ML models combined with XAI methods are promising for revealing the underlying impacts of air pollution on pediatric respiratory diseases, which can in turn support health-related decision-making.
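The description above mentions quantifying feature importance with XAI methods but does not say which ones. As a rough illustration only, the sketch below uses permutation importance, one common model-agnostic approach, and is not claimed to be the authors' method. Everything in it is an assumption: the data are synthetic, the feature names `aqi_scaled` and `noise` are hypothetical, and a small k-nearest-neighbour regressor stands in for the models named in the record.

```python
import random

def knn_predict(train_X, train_y, row, k=5):
    """Predict by averaging the targets of the k nearest training rows."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for x, y in zip(train_X, train_y)
    )
    return sum(y for _, y in dists[:k]) / k

def mse(train_X, train_y, X, y):
    """Mean squared error of the k-NN stand-in model on (X, y)."""
    preds = [knn_predict(train_X, train_y, row) for row in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

def permutation_importance(train_X, train_y, X, y, n_repeats=5, seed=0):
    """Average increase in MSE when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(train_X, train_y, X, y)
    scores = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(train_X, train_y, shuffled, y) - baseline)
        scores.append(sum(increases) / n_repeats)
    return scores

# Synthetic data (assumption): the target depends only on the first feature.
rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [10 * row[0] + rng.gauss(0, 0.5) for row in X]
train_X, train_y = X[:150], y[:150]
test_X, test_y = X[150:], y[150:]

feature_names = ["aqi_scaled", "noise"]  # hypothetical names
importance = dict(
    zip(feature_names, permutation_importance(train_X, train_y, test_X, test_y))
)
for name, score in importance.items():
    print(f"{name}: MSE increase {score:.3f}")
```

A feature whose shuffling barely changes the error contributes little to the predictions; a large increase marks a feature the model relies on. The importances are computed on held-out rows so that they reflect predictive use rather than memorised training points.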
first_indexed 2024-04-10T06:32:04Z
format Article
id doaj.art-9205fb2c63654d29a4c0206da1bf2647
institution Directory of Open Access Journals
issn 2296-6463
language English
last_indexed 2024-04-10T06:32:04Z
publishDate 2023-03-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Earth Science
spelling doaj.art-9205fb2c63654d29a4c0206da1bf2647
2023-03-01T05:35:24Z
eng
Frontiers Media S.A.
Frontiers in Earth Science
2296-6463
2023-03-01
11
10.3389/feart.2023.1105140
1105140
Regression analysis of air pollution and pediatric respiratory diseases based on interpretable machine learning
Yan Ji (0) Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters (CIC-FEMD)/Key Laboratory of Meteorological Disasters, Ministry of Education (KLME), Nanjing University of Information Science and Technology, Nanjing, China
Yan Ji (1) Weather Online Institute of Meteorological Applications, Wuxi, China
Xiefei Zhi (2) Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters (CIC-FEMD)/Key Laboratory of Meteorological Disasters, Ministry of Education (KLME), Nanjing University of Information Science and Technology, Nanjing, China
Xiefei Zhi (3) Weather Online Institute of Meteorological Applications, Wuxi, China
Ying Wu (4) Taizhou Environmental Monitoring Center, Taizhou, China
Yanqiu Zhang (5) Department of Environmental Occupational Hygiene, Taizhou Center for Disease Control and Prevention, Taizhou, China
Yitong Yang (6) Department of Environmental Occupational Hygiene, Taizhou Center for Disease Control and Prevention, Taizhou, China
Ting Peng (7) Taizhou Environmental Monitoring Center, Taizhou, China
Luying Ji (8) Key Laboratory of Transportation Meteorology of China Meteorological Administration, Nanjing Joint Institute for Atmospheric Sciences, Nanjing, China
https://www.frontiersin.org/articles/10.3389/feart.2023.1105140/full
air pollutants
respiratory diseases in children
explainable artificial intelligence (XAI)
feature importance analysis
Taizhou city
title Regression analysis of air pollution and pediatric respiratory diseases based on interpretable machine learning
topic air pollutants
respiratory diseases in children
explainable artificial intelligence (XAI)
feature importance analysis
Taizhou city
url https://www.frontiersin.org/articles/10.3389/feart.2023.1105140/full