Anomaly Detection and Repairing for Improving Air Quality Monitoring

Clean air in cities improves our health and overall quality of life and helps fight climate change and preserve our environment. High-resolution measures of pollutants’ concentrations can support the identification of urban areas with poor air quality and raise citizens’ awareness while encouraging...

Bibliographic Details
Main Authors: Federica Rollo, Chiara Bachechi, Laura Po
Format: Article
Language: English
Published: MDPI AG 2023-01-01
Series: Sensors
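Subjects: low-cost sensors, air quality sensors, air quality monitoring, anomaly detection, anomaly repairing, multivariate time series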
Online Access: https://www.mdpi.com/1424-8220/23/2/640
_version_ 1797437380817321984
author Federica Rollo
Chiara Bachechi
Laura Po
author_facet Federica Rollo
Chiara Bachechi
Laura Po
author_sort Federica Rollo
collection DOAJ
description Clean air in cities improves our health and overall quality of life and helps fight climate change and preserve our environment. High-resolution measures of pollutants’ concentrations can support the identification of urban areas with poor air quality and raise citizens’ awareness while encouraging more sustainable behaviors. Recent advances in Internet of Things (IoT) technology have led to extensive use of low-cost air quality sensors for hyper-local air quality monitoring. As a result, public administrations and citizens increasingly rely on information obtained from sensors to make decisions in their daily lives and mitigate pollution effects. Unfortunately, in most sensing applications, sensors are known to be error-prone. Thanks to Artificial Intelligence (AI) technologies, it is possible to devise computationally efficient methods that can automatically pinpoint anomalies in those data streams in real time. In order to enhance the reliability of air quality sensing applications, we believe that it is highly important to set up a data-cleaning process. In this work, we propose AIrSense, a novel AI-based framework for obtaining reliable pollutant concentrations from raw data collected by a network of low-cost sensors. It enacts an anomaly detection and repairing procedure on raw measurements before applying the calibration model, which converts raw measurements to concentration measurements of gases. There are very few studies of anomaly detection in raw air quality sensor data (millivolts). Our approach is the first that proposes to detect and repair anomalies in raw data before they are calibrated by considering the temporal sequence of the measurements and the correlations between different sensor features. If at least some previous measurements are available and not anomalous, it trains a model and uses the prediction to repair the observations; otherwise, it exploits the previous observation. Firstly, a majority voting system based on three different algorithms detects anomalies in raw data. Then, anomalies are repaired to avoid missing values in the measurement time series. In the end, the calibration model provides the pollutant concentrations. Experiments conducted on a real dataset of 12,000 observations produced by 12 low-cost sensors demonstrated the importance of the data-cleaning process in improving calibration algorithms’ performance.
first_indexed 2024-03-09T11:18:27Z
format Article
id doaj.art-a1644a5be6e7416db8f17771fd7c5055
institution Directory of Open Access Journals
issn 1424-8220
language English
last_indexed 2024-03-09T11:18:27Z
publishDate 2023-01-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-a1644a5be6e7416db8f17771fd7c5055
2023-12-01T00:25:05Z
eng
MDPI AG
Sensors
1424-8220
2023-01-01
23
2
640
10.3390/s23020640
Anomaly Detection and Repairing for Improving Air Quality Monitoring
Federica Rollo, “Enzo Ferrari” Engineering Department, University of Modena and Reggio Emilia, 41121 Modena, Italy
Chiara Bachechi, “Enzo Ferrari” Engineering Department, University of Modena and Reggio Emilia, 41121 Modena, Italy
Laura Po, “Enzo Ferrari” Engineering Department, University of Modena and Reggio Emilia, 41121 Modena, Italy
https://www.mdpi.com/1424-8220/23/2/640
low-cost sensors
air quality sensors
air quality monitoring
anomaly detection
anomaly repairing
multivariate time series
spellingShingle Federica Rollo
Chiara Bachechi
Laura Po
Anomaly Detection and Repairing for Improving Air Quality Monitoring
Sensors
low-cost sensors
air quality sensors
air quality monitoring
anomaly detection
anomaly repairing
multivariate time series
title Anomaly Detection and Repairing for Improving Air Quality Monitoring
title_full Anomaly Detection and Repairing for Improving Air Quality Monitoring
title_fullStr Anomaly Detection and Repairing for Improving Air Quality Monitoring
title_full_unstemmed Anomaly Detection and Repairing for Improving Air Quality Monitoring
title_short Anomaly Detection and Repairing for Improving Air Quality Monitoring
title_sort anomaly detection and repairing for improving air quality monitoring
topic low-cost sensors
air quality sensors
air quality monitoring
anomaly detection
anomaly repairing
multivariate time series
url https://www.mdpi.com/1424-8220/23/2/640
work_keys_str_mv AT federicarollo anomalydetectionandrepairingforimprovingairqualitymonitoring
AT chiarabachechi anomalydetectionandrepairingforimprovingairqualitymonitoring
AT laurapo anomalydetectionandrepairingforimprovingairqualitymonitoring