Model Prediction of PM2.5 and PM10 Using Machine Learning Approach

Bibliographic Details
Main Author: Hamid, Norfarhanah
Format: Monograph
Language: English
Published: Universiti Sains Malaysia 2021
Online Access: http://eprints.usm.my/54691/1/Model%20Prediction%20Of%20Pm2.5%20And%20Pm10%20Using%20Machine%20Learning%20Approach_Norfarhanah%20Hamid_K4_2021_ESAR.pdf
Description
Summary: This study developed multi-input-single-output (MISO) and multi-input-multi-output (MIMO) models, using an artificial neural network in MATLAB, to predict the concentrations of PM2.5 and PM10 respectively based on meteorological parameters. The historical dataset used as the case study was obtained from the Beijing Municipal Environmental Monitoring Centre. The model was developed for generic use: data pre-processing with two separate feature selection methods, the correlation coefficient and variable importance in projection (VIP) scores, identified the inputs most significant to the output for model development. Both methods produced similar results, with the gaseous pollutants carbon monoxide (CO), nitrogen dioxide (NO2) and sulfur dioxide (SO2) showing the highest correlation with the output target. Based on the feature selection, models were built with and without input selection using the Nonlinear Autoregressive with Exogenous Input (NARX) neural network, with 10 hidden neurons and 2 delays, trained with the Levenberg-Marquardt algorithm. Model performance was evaluated using Mean Square Error (MSE), Root Mean Square Error (RMSE), regression value (R) and coefficient of determination (R2). Models developed with and without input selection were compared; MISO Model 1 without input selection obtained the best performance, with MSE, RMSE, R and R2 of 0.0594, 0.2437, 0.9704 and 0.9417 respectively for testing. With input selection, the corresponding values were 0.0589, 0.2428, 0.9709 and 0.9427. Removing the irrelevant variables therefore neither increased precision significantly nor reduced performance substantially. Instead, knowing the key parameters most closely related to PM2.5 and PM10 would ensure a better prediction of the concentration. Prediction of PM2.5 and PM10 concentration using machine learning was achieved and is useful not only for improving public awareness but also for air quality management in Malaysia and other parts of the world.
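The four performance measures named in the summary can be illustrated with a minimal Python sketch. This is not the study's MATLAB code: the function name regression_metrics and the sample values are hypothetical, and R2 is computed here as 1 - SS_res/SS_tot, which is an assumption since the summary does not state the formula used. As a consistency check, the reported testing values satisfy RMSE = sqrt(MSE), since sqrt(0.0594) is approximately 0.2437.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, Pearson R and R2 for two equal-length sequences.

    Hypothetical helper for illustration; assumes y_true and y_pred
    each have non-zero variance.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    # Sum of squared residuals, then mean and root mean square error
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    rmse = math.sqrt(mse)

    # Pearson correlation between observed and predicted values
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    r = cov / math.sqrt(ss_tot * ss_pred)

    # Coefficient of determination (assumed definition: 1 - SS_res / SS_tot)
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "rmse": rmse, "r": r, "r2": r2}

# Made-up values purely for illustration, not data from the study
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower MSE and RMSE and values of R and R2 closer to 1 indicate a closer fit, which is how the models with and without input selection are compared in the summary.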