Artificial Neural Networks, Sequence-to-Sequence LSTMs, and Exogenous Variables as Analytical Tools for NO<sub>2</sub> (Air Pollution) Forecasting: A Case Study in the Bay of Algeciras (Spain)

This study aims to produce accurate predictions of the NO<sub>2</sub> concentrations at a specific station of a monitoring network located in the Bay of Algeciras (Spain). Artificial neural networks (ANNs) and sequence-to-sequence long short-term memory networks (LSTMs) were used to crea...

Bibliographic Details
Main Authors: Javier González-Enrique, Juan Jesús Ruiz-Aguilar, José Antonio Moscoso-López, Daniel Urda, Lipika Deka, Ignacio J. Turias
Format: Article
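Language: English
Published: MDPI AG 2021-03-01
Series: Sensors
Subjects: forecasting; feature selection; air pollution; nitrogen dioxide; artificial neural networks; LSTMs
Online Access: https://www.mdpi.com/1424-8220/21/5/1770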
author Javier González-Enrique
Juan Jesús Ruiz-Aguilar
José Antonio Moscoso-López
Daniel Urda
Lipika Deka
Ignacio J. Turias
collection DOAJ
description This study aims to produce accurate predictions of NO<sub>2</sub> concentrations at a specific station of a monitoring network located in the Bay of Algeciras (Spain). Artificial neural networks (ANNs) and sequence-to-sequence long short-term memory networks (LSTMs) were used to create the forecasting models. Additionally, a new prediction method was proposed that combines LSTMs using a rolling window scheme with a cross-validation procedure for time series (LSTM-CVT). Two different strategies were followed regarding the input variables: using NO<sub>2</sub> from the station, or employing NO<sub>2</sub> and other pollutant data from any station of the network plus meteorological variables. The ANN and LSTM-CVT exogenous models used lagged datasets of different window sizes. Several feature ranking methods were used to select the top lagged variables and include them in the final exogenous datasets. Prediction horizons of <i>t</i> + 1, <i>t</i> + 4 and <i>t</i> + 8 were employed. Including exogenous variables improved the models' performance, especially for <i>t</i> + 4 (<i>ρ</i> ≈ 0.68 to <i>ρ</i> ≈ 0.74) and <i>t</i> + 8 (<i>ρ</i> ≈ 0.59 to <i>ρ</i> ≈ 0.66). The proposed LSTM-CVT method delivered promising results: the best-performing model for each prediction horizon employed this new methodology. Additionally, for each parameter combination, it obtained lower error values than ANNs in 85% of the cases.
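The description mentions lagged datasets of different window sizes, prediction horizons such as t + 4, and a rolling window scheme for time-series validation. The following is a minimal illustrative sketch of those two generic ideas using NumPy only. It is not the authors' LSTM-CVT procedure: the function names `make_lagged` and `rolling_splits`, the parameter values, and the synthetic series are all assumptions made for illustration.

```python
import numpy as np


def make_lagged(series, window, horizon):
    """Build (X, y) from a 1-D series (hypothetical helper).

    Each row of X holds `window` consecutive values; the matching entry
    of y is the value `horizon` steps after the last value in that row.
    """
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - window - horizon + 1
    if n_rows <= 0:
        raise ValueError("series too short for this window and horizon")
    X = np.stack([series[i:i + window] for i in range(n_rows)])
    first_target = window + horizon - 1
    y = series[first_target:first_target + n_rows]
    return X, y


def rolling_splits(n_samples, train_size, test_size):
    """Yield (train_idx, test_idx) pairs (hypothetical helper).

    Every test block comes strictly after its training block in time,
    and the window slides forward by `test_size` samples per split.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train_idx = np.arange(start, start + train_size)
        test_idx = np.arange(start + train_size,
                             start + train_size + test_size)
        yield train_idx, test_idx
        start += test_size


# Synthetic stand-in for an hourly concentration series (assumed data).
series = np.arange(20.0)
X, y = make_lagged(series, window=4, horizon=4)  # horizon 4 mimics t + 4

for train_idx, test_idx in rolling_splits(len(X), train_size=6, test_size=2):
    # A model would be fitted on X[train_idx], y[train_idx] and scored
    # on X[test_idx], y[test_idx]; here only the index ranges are shown.
    print(train_idx[0], train_idx[-1], test_idx[0], test_idx[-1])
```

Keeping each test block after its training block avoids training on observations that lie in the future relative to the evaluated period, which is the usual motivation for time-series-specific cross-validation.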
first_indexed 2024-03-09T05:35:21Z
format Article
id doaj.art-308bf6ac2d86432287ff9b542a328b19
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T05:35:21Z
publishDate 2021-03-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-308bf6ac2d86432287ff9b542a328b19
2023-12-03T12:29:31Z
eng
MDPI AG
Sensors
1424-8220
2021-03-01
Volume 21, Issue 5, Article 1770
DOI 10.3390/s21051770
Javier González-Enrique (0): Intelligent Modelling of Systems Research Group (MIS), Department of Computer Science Engineering, Polytechnic School of Engineering, University of Cádiz, 11204 Algeciras, Spain
Juan Jesús Ruiz-Aguilar (1): Intelligent Modelling of Systems Research Group (MIS), Department of Industrial and Civil Engineering, Polytechnic School of Engineering, University of Cádiz, 11204 Algeciras, Spain
José Antonio Moscoso-López (2): Intelligent Modelling of Systems Research Group (MIS), Department of Industrial and Civil Engineering, Polytechnic School of Engineering, University of Cádiz, 11204 Algeciras, Spain
Daniel Urda (3): Grupo de Inteligencia Computacional Aplicada (GICAP), Departamento de Ingeniería Informática, Escuela Politécnica Superior, Universidad de Burgos, Av. Cantabria s/n, 09006 Burgos, Spain
Lipika Deka (4): The De Montfort University Interdisciplinary Group in Intelligent Transport Systems (DIGITS), Department of Computer Science and Informatics, De Montfort University, Leicester LE1 9BH, UK
Ignacio J. Turias (5): Intelligent Modelling of Systems Research Group (MIS), Department of Computer Science Engineering, Polytechnic School of Engineering, University of Cádiz, 11204 Algeciras, Spain
title Artificial Neural Networks, Sequence-to-Sequence LSTMs, and Exogenous Variables as Analytical Tools for NO<sub>2</sub> (Air Pollution) Forecasting: A Case Study in the Bay of Algeciras (Spain)
topic forecasting
feature selection
air pollution
nitrogen dioxide
artificial neural networks
LSTMs
url https://www.mdpi.com/1424-8220/21/5/1770