Forecasting the concentration of NO2 using statistical and machine learning methods: A case study in the UAE

Bibliographic Details
Main Authors: Aishah Al Yammahi, Zeyar Aung
Format: Article
Language: English
Published: Elsevier 2023-02-01
Series: Heliyon
Subjects: Machine learning; ARIMA; SARIMA; LSTM; NAR; Classical statistics
Online Access: http://www.sciencedirect.com/science/article/pii/S2405844022038725
author Aishah Al Yammahi
Zeyar Aung
collection DOAJ
description Nitrogen dioxide (NO2) is the most active pollutant gas emitted in the industrial era and is highly correlated with human activities. Tracking NO2 emissions and predicting NO2 concentrations are important steps toward controlling pollution and setting rules to protect people's health indoors, such as in factories, and in outdoor environments. The concentration of NO2 decreased during the COVID-19 lockdown period because of restrictions on outdoor activities. In this study, the concentration of NO2 was predicted at 14 ground stations in the United Arab Emirates (UAE) during December 2020 based on training over a full two-year period (2019–2020). Statistical and machine learning models, such as autoregressive integrated moving average (ARIMA), seasonal autoregressive integrated moving average (SARIMA), long short-term memory (LSTM), and nonlinear autoregressive neural network (NAR-NN), were used with both open- and closed-loop architectures. The mean absolute percentage error (MAPE) was used to evaluate the performance of the models, and the results ranged from “very good” (MAPE of 8.64% at the Liwa station with the closed loop) to “acceptable” (MAPE of 42.45% at the Khadejah School station with the open loop). The results show that the open-loop predictions are generally better than the closed-loop predictions because they yield statistically significantly lower MAPE values. For both loop types, we selected the stations exhibiting the lowest, medium, and highest MAPE values as representative cases. In addition, we demonstrated that the MAPE value is highly correlated with the relative standard deviation of the NO2 concentration values.
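The evaluation quantities named in the description can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the one-lag model, and the sample numbers are hypothetical, and the open/closed-loop behaviour follows the usual meaning of those terms (an open loop feeds each observed value back as the next input, a closed loop feeds back the model's own prediction), which the record itself does not spell out. The relative standard deviation uses the population standard deviation, which is also an assumption.

```python
from statistics import mean, pstdev


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


def relative_std(values):
    """Relative standard deviation (population std / mean), in percent."""
    return 100.0 * pstdev(values) / mean(values)


def forecast(last_observed, actual, coef, open_loop):
    """Hypothetical one-lag autoregressive model: next = coef * previous.

    open_loop=True  -> the previous input is the observed value.
    open_loop=False -> the previous input is the model's own last prediction.
    """
    predictions = []
    previous = last_observed
    for observed in actual:
        predicted = coef * previous
        predictions.append(predicted)
        previous = observed if open_loop else predicted
    return predictions


# Illustrative numbers only, not measurements from the study.
test_values = [20.0, 25.0, 20.0, 16.0]
open_preds = forecast(20.0, test_values, coef=1.0, open_loop=True)
closed_preds = forecast(20.0, test_values, coef=1.0, open_loop=False)
open_mape = mape(test_values, open_preds)
closed_mape = mape(test_values, closed_preds)
series_rsd = relative_std(test_values)
```

Computing `mape` per station alongside `relative_std` of that station's series gives the two columns whose correlation the description refers to.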
first_indexed 2024-04-10T06:21:47Z
format Article
id doaj.art-85bc2a965aed46d8ba25b74555568458
institution Directory Open Access Journal
issn 2405-8440
language English
last_indexed 2024-04-10T06:21:47Z
publishDate 2023-02-01
publisher Elsevier
record_format Article
series Heliyon
spelling doaj.art-85bc2a965aed46d8ba25b74555568458
2023-03-02T04:59:44Z
eng
Elsevier
Heliyon
2405-8440
2023-02-01
9
2
e12584
Forecasting the concentration of NO2 using statistical and machine learning methods: A case study in the UAE
Aishah Al Yammahi [0]
Zeyar Aung [1]
[0] Department of Electrical Engineering and Computer Science, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates; Corresponding author.
[1] Department of Electrical Engineering and Computer Science, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates; Center for Catalysis and Separation (CeCaS), Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
http://www.sciencedirect.com/science/article/pii/S2405844022038725
Machine learning
ARIMA
SARIMA
LSTM
NAR
Classical statistics
title Forecasting the concentration of NO2 using statistical and machine learning methods: A case study in the UAE
topic Machine learning
ARIMA
SARIMA
LSTM
NAR
Classical statistics
url http://www.sciencedirect.com/science/article/pii/S2405844022038725