Predicting Benzene Concentration Using Machine Learning and Time Series Algorithms

Bibliographic Details
Main Authors: Luis Alfonso Menéndez García, Fernando Sánchez Lasheras, Paulino José García Nieto, Laura Álvarez de Prado, Antonio Bernardo Sánchez
Format: Article
Language: English
Published: MDPI AG 2020-12-01
Series: Mathematics
Subjects:
Online Access: https://www.mdpi.com/2227-7390/8/12/2205
Description
Summary: Benzene is a pollutant that is very harmful to human health, so models are needed to predict its concentration and its relationship with other air pollutants. Data collected by eight stations in Madrid (Spain) over nine years were analyzed using regression-based machine learning models, namely multivariate linear regression (MLR), multivariate adaptive regression splines (MARS), multilayer perceptron neural network (MLP) and support vector machines (SVM), and time series models, namely autoregressive integrated moving-average (ARIMA) and vector autoregressive moving-average (VARMA). Benzene concentration was predicted from the concentrations of four environmental pollutants: nitrogen dioxide (NO<sub>2</sub>), nitrogen oxides (NO<sub>x</sub>), particulate matter (PM<sub>10</sub>) and toluene (C<sub>7</sub>H<sub>8</sub>), and performance measures were compared across the proposed models. In general, the regression-based machine learning models predicted benzene concentration more effectively than the time series models.
ISSN: 2227-7390