Data Augmentation for a Virtual-Sensor-Based Nitrogen and Phosphorus Monitoring

Bibliographic Details
Main Authors: Thulane Paepae, Pitshou N. Bokoro, Kyandoghere Kyamakya
Format: Article
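Language: English
Published: MDPI AG 2023-01-01
Series: Sensors
Subjects: water-quality monitoring, eutrophication, synthetic data, soft sensor, surrogate variables, variational autoencoder
Online Access: https://www.mdpi.com/1424-8220/23/3/1061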
author Thulane Paepae
Pitshou N. Bokoro
Kyandoghere Kyamakya
collection DOAJ
description To better control eutrophication, reliable and accurate information on phosphorus and nitrogen loading is desired. However, the high-frequency monitoring of these variables is economically impractical. This necessitates using virtual sensing to predict them by utilizing easily measurable variables as inputs. While the predictive performance of these data-driven, virtual-sensor models depends on the use of adequate training samples (in quality and quantity), the procurement and operational cost of nitrogen and phosphorus sensors make it impractical to acquire sufficient samples. For this reason, the variational autoencoder, which is one of the most prominent methods in generative models, was utilized in the present work for generating synthetic data. The generation capacity of the model was verified using water-quality data from two tributaries of the River Thames in the United Kingdom. Compared to the current state of the art, our novel data augmentation—including proper experimental settings or hyperparameter optimization—improved the root mean squared errors by 23–63%, with the most significant improvements observed when up to three predictors were used. In comparing the predictive algorithms’ performances (in terms of the predictive accuracy and computational cost), k-nearest neighbors and extremely randomized trees were the best-performing algorithms on average.
first_indexed 2024-03-11T09:27:16Z
format Article
id doaj.art-72eb75d9e6ef469091491935c03a850b
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-11T09:27:16Z
publishDate 2023-01-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-72eb75d9e6ef469091491935c03a850b; 2023-11-16T17:55:34Z; eng; MDPI AG; Sensors; ISSN 1424-8220; 2023-01-01; volume 23, issue 3, article 1061; DOI 10.3390/s23031061
Thulane Paepae: Department of Electrical and Electronic Engineering Technology, University of Johannesburg, Doornfontein 2028, South Africa
Pitshou N. Bokoro: Department of Electrical and Electronic Engineering Technology, University of Johannesburg, Doornfontein 2028, South Africa
Kyandoghere Kyamakya: Institute for Smart Systems Technologies, Transportation Informatics, Alpen-Adria Universität Klagenfurt, 9020 Klagenfurt, Austria
title Data Augmentation for a Virtual-Sensor-Based Nitrogen and Phosphorus Monitoring
topic water-quality monitoring
eutrophication
synthetic data
soft sensor
surrogate variables
variational autoencoder
url https://www.mdpi.com/1424-8220/23/3/1061