Comprehensive comparison of various machine learning algorithms for short-term ozone concentration prediction

Ozone (O3) is a common air pollutant. An increase in ozone concentration can adversely affect public health and the environment, including vegetation and crops. Atmospheric air quality monitoring systems have therefore been established to monitor and predict ozone concentration. Due to complex form...

Bibliographic Details
Main Authors: Ayman Yafouz, Nouar AlDahoul, Ahmed H. Birima, Ali Najah Ahmed, Mohsen Sherif, Ahmed Sefelnasr, Mohammed Falah Allawi, Ahmed Elshafie
Format: Article
Language: English
Published: Elsevier 2022-06-01
Series: Alexandria Engineering Journal
Subjects: Air quality, Ozone concentration prediction, Machine learning, Hyperparameter optimization
Online Access: http://www.sciencedirect.com/science/article/pii/S1110016821006918
author Ayman Yafouz
Nouar AlDahoul
Ahmed H. Birima
Ali Najah Ahmed
Mohsen Sherif
Ahmed Sefelnasr
Mohammed Falah Allawi
Ahmed Elshafie
collection DOAJ
description Ozone (O3) is a common air pollutant. An increase in ozone concentration can adversely affect public health and the environment, including vegetation and crops. Atmospheric air quality monitoring systems have therefore been established to monitor and predict ozone concentration. Because ozone formation is complex and is influenced by ozone precursors and meteorological conditions, there is a need to examine and evaluate various machine learning (ML) models for ozone concentration prediction. This study uses several ML models, namely Linear Regression (LR), Tree Regression (TR), Support Vector Regression (SVR), Ensemble Regression (ER), Gaussian Process Regression (GPR) and Artificial Neural Networks (ANN), to predict tropospheric ozone (O3) from an ozone concentration dataset. The dataset consists of hourly average observations from air quality monitoring systems at three stations in Peninsular Malaysia: Putrajaya, Kelang, and KL. The prediction models were trained on this dataset and validated, with their hyperparameters optimized. Model performance was evaluated in terms of RMSE, MAE, R2, and training time. The results indicated that LR, SVR, GPR and ANN gave the highest R2 (83 % and 89 %) with specific hyperparameters at the Kelang and KL stations, respectively. SVR and ER, on the other hand, outperformed the other models in terms of R2 (79 %) at the Putrajaya station. Overall, despite slight performance differences, several of the developed models learned the patterns well and provided good prediction performance in terms of R2, RMSE and MAE. Ensemble regression models were found to balance high prediction accuracy in terms of R2 against low training time, and are thus considered a feasible solution for ozone concentration prediction from hourly data.
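The description names RMSE, MAE and R2 as the evaluation metrics but does not define them. The sketch below shows the standard textbook definitions in plain Python; it is not the authors' code. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study's dataset.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, MAE and R2 for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    # Residual sum of squares, shared by RMSE and R2
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # Total sum of squares around the mean of the observations
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2}


# Made-up hourly ozone values, for illustration only (not from the study)
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
```

An R2 reported as a percentage, as in the description, corresponds to this value multiplied by 100.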
first_indexed 2024-04-12T17:06:13Z
format Article
id doaj.art-90b31e0804934e399dcf8c692dde9088
institution Directory Open Access Journal
issn 1110-0168
language English
last_indexed 2024-04-12T17:06:13Z
publishDate 2022-06-01
publisher Elsevier
record_format Article
series Alexandria Engineering Journal
spelling doaj.art-90b31e0804934e399dcf8c692dde9088
2022-12-22T03:23:55Z
eng
Elsevier
Alexandria Engineering Journal
1110-0168
2022-06-01
Volume 61, Issue 6, pp. 4607-4622
Comprehensive comparison of various machine learning algorithms for short-term ozone concentration prediction
Ayman Yafouz: Department of Civil Engineering, College of Engineering, Universiti Tenaga Nasional, 43000 Selangor, Malaysia
Nouar AlDahoul: Faculty of Engineering, Multimedia University, 63100 Cyberjaya, Malaysia
Ahmed H. Birima: Department of Civil Engineering, College of Engineering, Qassim University, Unaizah, Saudi Arabia
Ali Najah Ahmed (corresponding author): Institute of Energy Infrastructure (IEI), Department of Civil Engineering, College of Engineering, Universiti Tenaga Nasional (UNITEN), 43000 Selangor, Malaysia
Mohsen Sherif: Civil and Environmental Eng. Dept, College of Engineering, United Arab Emirates University, Al Ain P.O. Box. 15551, United Arab Emirates; National Water and Energy Center, United Arab Emirates University, Al Ain P.O. Box. 15551, United Arab Emirates
Ahmed Sefelnasr: National Water and Energy Center, United Arab Emirates University, Al Ain P.O. Box. 15551, United Arab Emirates
Mohammed Falah Allawi: New Era and Development in Civil Engineering Research Group, Scientific Research Center, Al-Ayen University, Thi-Qar 64001, Iraq
Ahmed Elshafie: Department of Civil Engineering, Faculty of Engineering, University of Malaya (UM), 50603 Kuala Lumpur, Malaysia
http://www.sciencedirect.com/science/article/pii/S1110016821006918
Air quality
Ozone concentration prediction
Machine learning
Hyperparameter optimization
title Comprehensive comparison of various machine learning algorithms for short-term ozone concentration prediction
topic Air quality
Ozone concentration prediction
Machine learning
Hyperparameter optimization
url http://www.sciencedirect.com/science/article/pii/S1110016821006918