Monte Carlo Optimization for Sliding Window Size in Dixon Quality Control of Environmental Monitoring Time Series Data

Outliers are often present in large datasets of water quality monitoring time series data. A method of combining the sliding window technique with the Dixon detection criterion for the automatic detection of outliers in time series data is limited by the empirical determination of sliding window sizes....

Bibliographic Details
Main Authors: Zhongya Fan, Huiyun Feng, Jingang Jiang, Changjin Zhao, Ni Jiang, Wencai Wang, Fantang Zeng
Format: Article
Language: English
Published: MDPI AG 2020-03-01
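Series: Applied Sciences
Subjects: time series environmental monitoring data; monte carlo optimization; data quality control; sliding window size
Online Access: https://www.mdpi.com/2076-3417/10/5/1876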
author Zhongya Fan
Huiyun Feng
Jingang Jiang
Changjin Zhao
Ni Jiang
Wencai Wang
Fantang Zeng
author_sort Zhongya Fan
collection DOAJ
description Outliers are often present in large datasets of water quality monitoring time series data. Combining the sliding window technique with the Dixon detection criterion allows outliers in time series data to be detected automatically, but the method is limited by the empirical determination of sliding window sizes. Determining the optimal sliding window size on a scientific basis is therefore worthwhile. This paper presents a new Monte Carlo Search Method (MCSM), based on random sampling, to optimize the size of the sliding window, taking full advantage of computers and statistics. The MCSM was applied in a case study to automatic monitoring data of water quality factors to test its validity and usefulness. A comparison of accuracy and efficiency shows that the new method is scientific and effective. The experimental results show that, at different sample sizes, the average accuracy is between 58.70% and 75.75%, and the average computation time increase is between 17.09% and 45.53%. In the era of big data in environmental monitoring, the proposed method can meet the required accuracy of outlier detection and improve the efficiency of calculation.
first_indexed 2024-12-11T09:33:43Z
format Article
id doaj.art-fdb843663d4848f5a1870d388d7b4620
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-12-11T09:33:43Z
publishDate 2020-03-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-fdb843663d4848f5a1870d388d7b4620
2022-12-22T01:12:57Z
eng
MDPI AG
Applied Sciences
2076-3417
2020-03-01
Volume 10, Issue 5, Article 1876
DOI: 10.3390/app10051876
app10051876
Monte Carlo Optimization for Sliding Window Size in Dixon Quality Control of Environmental Monitoring Time Series Data
Zhongya Fan: State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of PRC, Guangzhou 510530, China
Huiyun Feng: Institute of Technical Biology & Agriculture Engineering, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China
Jingang Jiang: Institute of Technical Biology & Agriculture Engineering, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China
Changjin Zhao: State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of PRC, Guangzhou 510530, China
Ni Jiang: State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of PRC, Guangzhou 510530, China
Wencai Wang: State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of PRC, Guangzhou 510530, China
Fantang Zeng: State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of PRC, Guangzhou 510530, China
https://www.mdpi.com/2076-3417/10/5/1876
time series environmental monitoring data
monte carlo optimization
data quality control
sliding window size
title Monte Carlo Optimization for Sliding Window Size in Dixon Quality Control of Environmental Monitoring Time Series Data
topic time series environmental monitoring data
monte carlo optimization
data quality control
sliding window size
url https://www.mdpi.com/2076-3417/10/5/1876