Data Augmentation for a Virtual-Sensor-Based Nitrogen and Phosphorus Monitoring
To better control eutrophication, reliable and accurate information on phosphorus and nitrogen loading is desired. However, the high-frequency monitoring of these variables is economically impractical. This necessitates using virtual sensing to predict them by utilizing easily measurable variables a...
Main Authors: | Thulane Paepae, Pitshou N. Bokoro, Kyandoghere Kyamakya |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2023-01-01 |
Series: | Sensors |
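Subjects: | water-quality monitoring, eutrophication, synthetic data, soft sensor, surrogate variables, variational autoencoder |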
Online Access: | https://www.mdpi.com/1424-8220/23/3/1061 |
_version_ | 1797623332844077056 |
---|---|
author | Thulane Paepae; Pitshou N. Bokoro; Kyandoghere Kyamakya |
author_facet | Thulane Paepae; Pitshou N. Bokoro; Kyandoghere Kyamakya |
author_sort | Thulane Paepae |
collection | DOAJ |
description | To better control eutrophication, reliable and accurate information on phosphorus and nitrogen loading is needed. However, high-frequency monitoring of these variables is economically impractical. This necessitates virtual sensing, which predicts them from easily measurable input variables. While the predictive performance of these data-driven, virtual-sensor models depends on adequate training samples (in quality and quantity), the procurement and operational costs of nitrogen and phosphorus sensors make it impractical to acquire sufficient samples. For this reason, the variational autoencoder, one of the most prominent generative models, was used in the present work to generate synthetic data. The generation capacity of the model was verified using water-quality data from two tributaries of the River Thames in the United Kingdom. Compared to the current state of the art, our novel data augmentation (including proper experimental settings or hyperparameter optimization) improved the root mean squared errors by 23–63%, with the most significant improvements observed when up to three predictors were used. In comparing the predictive algorithms’ performances (in terms of predictive accuracy and computational cost), k-nearest neighbors and extremely randomized trees were the best-performing algorithms on average. |
first_indexed | 2024-03-11T09:27:16Z |
format | Article |
id | doaj.art-72eb75d9e6ef469091491935c03a850b |
institution | Directory of Open Access Journals |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-11T09:27:16Z |
publishDate | 2023-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-72eb75d9e6ef469091491935c03a850b; 2023-11-16T17:55:34Z; eng; MDPI AG; Sensors; 1424-8220; 2023-01-01; 23; 3; 1061; 10.3390/s23031061; Data Augmentation for a Virtual-Sensor-Based Nitrogen and Phosphorus Monitoring; Thulane Paepae (Department of Electrical and Electronic Engineering Technology, University of Johannesburg, Doornfontein 2028, South Africa); Pitshou N. Bokoro (Department of Electrical and Electronic Engineering Technology, University of Johannesburg, Doornfontein 2028, South Africa); Kyandoghere Kyamakya (Institute for Smart Systems Technologies, Transportation Informatics, Alpen-Adria Universität Klagenfurt, 9020 Klagenfurt, Austria); https://www.mdpi.com/1424-8220/23/3/1061; water-quality monitoring; eutrophication; synthetic data; soft sensor; surrogate variables; variational autoencoder |
spellingShingle | Thulane Paepae; Pitshou N. Bokoro; Kyandoghere Kyamakya; Data Augmentation for a Virtual-Sensor-Based Nitrogen and Phosphorus Monitoring; Sensors; water-quality monitoring; eutrophication; synthetic data; soft sensor; surrogate variables; variational autoencoder |
title | Data Augmentation for a Virtual-Sensor-Based Nitrogen and Phosphorus Monitoring |
title_full | Data Augmentation for a Virtual-Sensor-Based Nitrogen and Phosphorus Monitoring |
title_fullStr | Data Augmentation for a Virtual-Sensor-Based Nitrogen and Phosphorus Monitoring |
title_full_unstemmed | Data Augmentation for a Virtual-Sensor-Based Nitrogen and Phosphorus Monitoring |
title_short | Data Augmentation for a Virtual-Sensor-Based Nitrogen and Phosphorus Monitoring |
title_sort | data augmentation for a virtual sensor based nitrogen and phosphorus monitoring |
topic | water-quality monitoring; eutrophication; synthetic data; soft sensor; surrogate variables; variational autoencoder |
url | https://www.mdpi.com/1424-8220/23/3/1061 |
work_keys_str_mv | AT thulanepaepae dataaugmentationforavirtualsensorbasednitrogenandphosphorusmonitoring AT pitshounbokoro dataaugmentationforavirtualsensorbasednitrogenandphosphorusmonitoring AT kyandogherekyamakya dataaugmentationforavirtualsensorbasednitrogenandphosphorusmonitoring |
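
The description field reports root mean squared error (RMSE) improvements of 23–63%. The record does not say how the percentage was computed, so the sketch below assumes the usual relative reduction against a baseline RMSE. The function names and all numbers are hypothetical illustrations, not values or code from the article.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_improvement_pct(rmse_baseline, rmse_new):
    """Relative RMSE reduction in percent (assumed definition of 'improvement')."""
    if rmse_baseline <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (rmse_baseline - rmse_new) / rmse_baseline


# Hypothetical measurements and predictions, chosen only to make the arithmetic easy.
observed = [1.0, 2.0, 3.0, 4.0]
pred_baseline = [1.5, 2.5, 2.5, 4.5]       # model trained without synthetic data
pred_augmented = [1.25, 2.25, 2.75, 4.25]  # model trained with synthetic data

baseline_error = rmse(observed, pred_baseline)
augmented_error = rmse(observed, pred_augmented)
improvement = rmse_improvement_pct(baseline_error, augmented_error)
print(f"baseline RMSE={baseline_error}, augmented RMSE={augmented_error}, "
      f"improvement={improvement}%")
```

In this toy case every baseline prediction is off by 0.5 and every augmented prediction by 0.25, so the RMSE halves and the relative improvement is 50%.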