A Wavelet PM2.5 Prediction System Using Optimized Kernel Extreme Learning with Boruta-XGBoost Feature Selection

Fine particulate matter (PM2.5) concentration is a vital source of information and an essential indicator for measuring and studying the concentration of other air pollutants. It is crucial to realize more accurate predictions of PM2.5 and establish a high-accuracy PM2.5 prediction model due to t...

Bibliographic Details
Main Authors: Ali Asghar Heidari, Mehdi Akhoondzadeh, Huiling Chen
Format: Article
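Language: English
Published: MDPI AG 2022-09-01
Series: Mathematics
Subjects: air pollution; optimization; PM2.5 prediction; kernel extreme learning machine; machine learning
Online Access: https://www.mdpi.com/2227-7390/10/19/3566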
author Ali Asghar Heidari
Mehdi Akhoondzadeh
Huiling Chen
collection DOAJ
description Fine particulate matter (PM2.5) concentration is a vital source of information and an essential indicator for measuring and studying the concentrations of other air pollutants. Accurate PM2.5 prediction and a high-accuracy prediction model are crucial because of their social impact and cross-field applications in geospatial engineering. To further improve the accuracy of PM2.5 prediction, this paper proposes a new wavelet PM2.5 prediction system (called the WD-OSMSSA-KELM model) based on a new, improved variant of the salp swarm algorithm (OSMSSA), the kernel extreme learning machine (KELM), wavelet decomposition, and Boruta-XGBoost (B-XGB) feature selection. First, we applied B-XGB feature selection to identify the best features for predicting hourly PM2.5 concentrations. Then, we applied the wavelet decomposition (WD) algorithm to obtain the multi-scale decomposition and single-branch reconstruction of the PM2.5 concentrations, which mitigates the prediction error produced by time series data. In the next stage, we optimized the parameters of the KELM model for each reconstructed component. An improved version of the SSA is proposed to outperform the basic SSA optimizer and avoid local stagnation. In this work, we propose new operators based on opposition-based learning and simplex-based search to mitigate the core problems of the conventional SSA. In addition, we replaced the main parameter of the SSA with a time-varying parameter. To further boost the exploration tendency of the SSA, we propose using random leaders to guide the swarm towards new regions of the feature space based on a conditional structure. The optimized model was then used to predict the PM2.5 concentrations, and several error metrics were applied to evaluate its performance and accuracy.
The proposed model was evaluated on an hourly database with six air pollutants and six meteorological features collected from the Beijing Municipal Environmental Monitoring Center. The experimental results show that the proposed WD-OLMSSA-KELM model can predict the PM2.5 concentration with superior performance (R: 0.995, RMSE: 11.906, MdAE: 2.424, MAPE: 9.768, KGE: 0.963, R²: 0.990) compared to the WD-CatBoost, WD-LightGBM, WD-Xgboost, and WD-Ridge methods.
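The KELM stage named in the description can be illustrated with a minimal sketch. It uses the standard closed-form KELM solution, beta = (K + I/C)^-1 y, with an RBF kernel. The kernel choice, the class and function names, and the default values of C and gamma are assumptions made for illustration and are not taken from the article. C and gamma stand in for the kind of parameters an optimizer such as the improved SSA would tune.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B (assumed kernel)."""
    # Squared Euclidean distance between every row of A and every row of B.
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))


class KELM:
    """Kernel extreme learning machine for regression (illustrative sketch)."""

    def __init__(self, C=100.0, gamma=1.0):
        self.C = C          # regularization coefficient (hypothetical default)
        self.gamma = gamma  # RBF kernel width parameter (hypothetical default)

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        K = rbf_kernel(self.X_train, self.X_train, self.gamma)
        n = K.shape[0]
        # Closed-form output weights: solve (K + I/C) beta = y.
        self.beta = np.linalg.solve(K + np.eye(n) / self.C, y)
        return self

    def predict(self, X):
        # Kernel between the query points and the stored training points.
        K = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        return K @ self.beta


if __name__ == "__main__":
    # Toy data only; this is not the Beijing dataset used in the article.
    X = np.linspace(0.0, 2.0 * np.pi, 50).reshape(-1, 1)
    y = np.sin(X).ravel()
    model = KELM(C=1e4, gamma=1.0).fit(X, y)
    print(model.predict(np.array([[1.0], [2.5]])))
```

In a pipeline like the one described, one such model would be fitted per reconstructed wavelet component, with the component predictions combined afterwards. That arrangement is only outlined in the description and is not reproduced here.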
first_indexed 2024-03-09T21:27:49Z
format Article
id doaj.art-ab2f4128d33d4f91aa29ed657e6a0ecd
institution Directory Open Access Journal
issn 2227-7390
language English
last_indexed 2024-03-09T21:27:49Z
publishDate 2022-09-01
publisher MDPI AG
record_format Article
series Mathematics
spelling doaj.art-ab2f4128d33d4f91aa29ed657e6a0ecd; 2023-11-23T21:03:42Z; eng; MDPI AG; Mathematics; ISSN 2227-7390; 2022-09-01; vol. 10, no. 19, article 3566; DOI 10.3390/math10193566
Ali Asghar Heidari: School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran 1439957131, Iran
Mehdi Akhoondzadeh: Photogrammetry and Remote Sensing Department, School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, North Amirabad Ave., Tehran 1439957131, Iran
Huiling Chen: Department of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou 325035, China
title A Wavelet PM2.5 Prediction System Using Optimized Kernel Extreme Learning with Boruta-XGBoost Feature Selection
topic air pollution
optimization
PM2.5 prediction
kernel extreme learning machine
machine learning
url https://www.mdpi.com/2227-7390/10/19/3566