Clustering of Time Series Water Quality Data Using Dynamic Time Warping: A Case Study from the Bukhan River Water Quality Monitoring Network
It is essential to monitor water quality for river water management because river water is used for various purposes and is directly related to the health and safety of a population. Proper network installation and removal is an important part of water quality monitoring and network operation effici...
Main Authors: | Seulbi Lee, Jaehoon Kim, Jongyeon Hwang, EunJi Lee, Kyoung-Jin Lee, Jeongkyu Oh, Jungsu Park, Tae-Young Heo |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-08-01 |
Series: | Water |
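Subjects: | dynamic time warping; water quality network optimization; cluster analysis; river water system; water quality characteristics |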
Online Access: | https://www.mdpi.com/2073-4441/12/9/2411 |
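
The abstract contrasts Euclidean distance, which compares stations at the same time-point, with DTW, which searches over alignments and so tolerates a downstream time lag, and then clusters stations with K-medoids. The following is a minimal pure-Python sketch of that idea, not the authors' implementation: the station names and values are invented toy data, the K-medoids routine is a simple alternating variant with a farthest-first start, and the number of clusters is fixed at 2 rather than chosen with a validity index.

```python
import math


def euclidean_distance(a, b):
    """Lock-step distance: compares values at the same time index."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length series")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW with squared pointwise cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b[j-1] reused
                                 cost[i][j - 1],      # b advances, a[i-1] reused
                                 cost[i - 1][j - 1])  # both advance
    return math.sqrt(cost[n][m])


def distance_matrix(series, metric):
    """Symmetric pairwise distance matrix for a list of series."""
    n = len(series)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = metric(series[i], series[j])
    return dist


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix (assumes k <= n)."""
    n = len(dist)
    # Farthest-first start: most central point, then points far from chosen medoids.
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))
    for _ in range(max_iter):
        # Assignment step: each series joins its nearest medoid.
        labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            # Update step: the member with the smallest total distance to its cluster.
            new_medoids.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels, medoids


# Hypothetical toy data: a pulse seen two steps later downstream, plus two
# flat tributary series of different lengths (DTW does not need equal lengths).
stations = {
    "main_upstream": [0, 0, 1, 3, 1, 0, 0, 0],
    "main_downstream": [0, 0, 0, 0, 1, 3, 1, 0],
    "tributary_a": [2, 2, 2, 2, 2, 2, 2, 2],
    "tributary_b": [2, 2, 2, 2, 2, 2, 2],
}
names = list(stations)
series = [stations[name] for name in names]

lag_euclid = euclidean_distance(series[0], series[1])
lag_dtw = dtw_distance(series[0], series[1])
print("upstream vs downstream: Euclidean", round(lag_euclid, 3), "DTW", lag_dtw)

dist = distance_matrix(series, dtw_distance)
labels, medoids = k_medoids(dist, k=2)
for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

In this toy case the lagged pulse is far from its upstream copy under Euclidean distance but identical under DTW, so the two mainstream series share a cluster. A Euclidean run over the same data would first have to truncate or drop `tributary_b`, which mirrors the data loss the abstract mentions.
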
_version_ | 1797555358912217088 |
---|---|
author | Seulbi Lee, Jaehoon Kim, Jongyeon Hwang, EunJi Lee, Kyoung-Jin Lee, Jeongkyu Oh, Jungsu Park, Tae-Young Heo |
author_sort | Seulbi Lee |
collection | DOAJ |
description | It is essential to monitor water quality for river water management because river water is used for various purposes and is directly related to the health and safety of a population. Proper network installation and removal is an important part of water quality monitoring and network operation efficiency. To support this, cluster analysis based on the calculated similarity between measuring stations can be used. In this study, we measured the similarities between 12 water quality monitoring stations of the Bukhan River. River water quality data always have a station-dependent time lag because water flows from upstream to downstream; therefore, we proposed using a Dynamic Time Warping (DTW) algorithm, which searches for the minimum distance by shifting and comparing time-points, rather than the Euclidean algorithm, which compares only the same time-point. Both the Euclidean and DTW algorithms were applied to nine water quality variables to identify similarities between stations, and K-medoids cluster analysis was performed based on those similarities. The Clustering Validation Index (CVI) was used to select the optimal number of clusters. Our results show that the Euclidean algorithm formed clusters that mixed mainstream and tributary stations; the mainstream stations were largely divided into three different clusters. In contrast, the DTW algorithm formed clear clusters that reflected the characteristics of water quality and watershed. Furthermore, because the Euclidean algorithm requires the time series to be of equal length, data loss was inevitable. As a result, even where clusters were the same as those obtained by DTW, the characteristics of the water quality variables in the cluster differed. The DTW analysis in this study provides useful information for understanding the similarity or difference in water parameter values between different locations. Thus, the number and location of required monitoring stations can be adjusted to improve the efficiency of field monitoring network management. |
first_indexed | 2024-03-10T16:45:23Z |
format | Article |
id | doaj.art-ecd4058d13cd4b9f98d74dd9bf43fbb8 |
institution | Directory Open Access Journal |
issn | 2073-4441 |
language | English |
last_indexed | 2024-03-10T16:45:23Z |
publishDate | 2020-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Water |
spelling | doaj.art-ecd4058d13cd4b9f98d74dd9bf43fbb8; 2023-11-20T11:40:01Z; eng; MDPI AG; Water; ISSN 2073-4441; 2020-08-01; vol. 12, no. 9, art. 2411; DOI 10.3390/w12092411; Clustering of Time Series Water Quality Data Using Dynamic Time Warping: A Case Study from the Bukhan River Water Quality Monitoring Network; Authors: Seulbi Lee [0], Jaehoon Kim [1], Jongyeon Hwang [2], EunJi Lee [3], Kyoung-Jin Lee [4], Jeongkyu Oh [5], Jungsu Park [6], Tae-Young Heo [7]; Affiliations: [0] Future Strategy Department, Chungbuk Innovation Institute of Science & Technology, Chungbuk 28126, Korea; [1] Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea; [2] Environmental Measurement and Analysis Center, National Institute of Environmental Research, Incheon 22689, Korea; [3] Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea; [4] Engineering Division, DongMoon ENT Co., Ltd., Seoul 08377, Korea; [5] Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea; [6] Department of Civil and Environmental Engineering, Hanbat National University, Daejeon 34158, Korea; [7] Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea; https://www.mdpi.com/2073-4441/12/9/2411; Keywords: dynamic time warping; water quality network optimization; cluster analysis; river water system; water quality characteristics |
title | Clustering of Time Series Water Quality Data Using Dynamic Time Warping: A Case Study from the Bukhan River Water Quality Monitoring Network |
topic | dynamic time warping; water quality network optimization; cluster analysis; river water system; water quality characteristics |
url | https://www.mdpi.com/2073-4441/12/9/2411 |