Clustering of Time Series Water Quality Data Using Dynamic Time Warping: A Case Study from the Bukhan River Water Quality Monitoring Network


Bibliographic Details
Main Authors: Seulbi Lee, Jaehoon Kim, Jongyeon Hwang, EunJi Lee, Kyoung-Jin Lee, Jeongkyu Oh, Jungsu Park, Tae-Young Heo
Format: Article
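Language: English
Published: MDPI AG 2020-08-01
Series: Water
Subjects: dynamic time warping; water quality network optimization; cluster analysis; river water system; water quality characteristics
Online Access: https://www.mdpi.com/2073-4441/12/9/2411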
author Seulbi Lee
Jaehoon Kim
Jongyeon Hwang
EunJi Lee
Kyoung-Jin Lee
Jeongkyu Oh
Jungsu Park
Tae-Young Heo
collection DOAJ
description It is essential to monitor water quality for river water management because river water is used for various purposes and is directly related to the health and safety of a population. Proper network installation and removal is an important part of water quality monitoring and network operation efficiency. To do this, cluster analysis based on calculated similarity between measuring stations can be used. In this study, we measured the similarities between 12 water quality monitoring stations of the Bukhan River. River water quality data always have a station-dependent time lag because water flows from upstream to downstream; therefore, we proposed a Dynamic Time Warping (DTW) algorithm that searches for the minimum distance by shifting and comparing time-points, rather than using the Euclidean algorithm, which compares only the same time-point. Both the Euclidean and DTW algorithms were applied to nine water quality variables to identify similarities between stations, and K-medoids cluster analysis was performed based on those similarities. The Clustering Validation Index (CVI) was used to select the optimal number of clusters. Our results show that the Euclidean algorithm formed clusters that mixed mainstream and tributary stations; the mainstream stations were largely divided into three different clusters. In contrast, the DTW algorithm formed clear clusters that reflected the characteristics of water quality and watershed. Furthermore, because the Euclidean algorithm requires the time series to have the same length, data loss was inevitable. As a result, even where clusters were the same as those obtained by DTW, the characteristics of the water quality variables in the cluster differed. The DTW analysis in this study provides useful information for understanding the similarity or difference in water parameter values between different locations. Thus, the number and location of required monitoring stations can be adjusted to improve the efficiency of field monitoring network management.
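The contrast drawn in the description between same-time-point (Euclidean) comparison and DTW can be illustrated with a small sketch. This is not the authors' code or data: the four toy series, the station names, the squared pointwise cost, and the naive k-medoids initialisation are all assumptions made for illustration, and the record does not say which software the study used.

```python
import math


def euclidean_distance(a, b):
    """Compare the same time-points; the series must have equal length."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs series of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Textbook dynamic-programming DTW with a squared pointwise cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def assign(dist, medoids):
    """Label each series with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    Assumption: the first k series serve as initial medoids (naive,
    deterministic); real analyses usually use a better initialisation.
    """
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties
                new_medoids.append(medoids[c])
                continue
            # the new medoid minimises total distance to its cluster members
            new_medoids.append(min(members, key=lambda i: sum(dist[i][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(dist, medoids)


# Hypothetical toy series: the "down" and "b" series are lagged copies.
main_up = [0, 0, 1, 3, 1, 0, 0, 0]
trib_a = [2, 2, 3, 2, 2, 3, 2, 2]
main_down = [0, 0, 0, 0, 1, 3, 1, 0]
trib_b = [2, 3, 2, 2, 3, 2, 2, 2]
series = [main_up, trib_a, main_down, trib_b]

dist = [[dtw_distance(s, t) for t in series] for s in series]
medoids, labels = k_medoids(dist, k=2)

print("Euclidean main_up vs main_down:", euclidean_distance(main_up, main_down))
print("DTW main_up vs main_down:", dtw_distance(main_up, main_down))
print("DTW-based cluster labels:", labels)
```

In this toy setup the lagged pair has a DTW distance of zero while its Euclidean distance is large, so the DTW-based clustering groups the two mainstream-like series together and the two tributary-like series together. Selecting the number of clusters with a validation index, as the study did, is not shown here.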
first_indexed 2024-03-10T16:45:23Z
format Article
id doaj.art-ecd4058d13cd4b9f98d74dd9bf43fbb8
institution Directory of Open Access Journals
issn 2073-4441
language English
last_indexed 2024-03-10T16:45:23Z
publishDate 2020-08-01
publisher MDPI AG
record_format Article
series Water
spelling doaj.art-ecd4058d13cd4b9f98d74dd9bf43fbb8
2023-11-20T11:40:01Z
eng
MDPI AG
Water, ISSN 2073-4441
2020-08-01, volume 12, issue 9, article 2411
DOI 10.3390/w12092411
Clustering of Time Series Water Quality Data Using Dynamic Time Warping: A Case Study from the Bukhan River Water Quality Monitoring Network
Seulbi Lee (0): Future Strategy Department, Chungbuk Innovation Institute of Science & Technology, Chungbuk 28126, Korea
Jaehoon Kim (1): Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea
Jongyeon Hwang (2): Environmental Measurement and Analysis Center, National Institute of Environmental Research, Incheon 22689, Korea
EunJi Lee (3): Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea
Kyoung-Jin Lee (4): Engineering Division, DongMoon ENT Co., Ltd., Seoul 08377, Korea
Jeongkyu Oh (5): Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea
Jungsu Park (6): Department of Civil and Environmental Engineering, Hanbat National University, Daejeon 34158, Korea
Tae-Young Heo (7): Department of Information & Statistics, Chungbuk National University, Chungbuk 28644, Korea
https://www.mdpi.com/2073-4441/12/9/2411
dynamic time warping
water quality network optimization
cluster analysis
river water system
water quality characteristics
title Clustering of Time Series Water Quality Data Using Dynamic Time Warping: A Case Study from the Bukhan River Water Quality Monitoring Network
topic dynamic time warping
water quality network optimization
cluster analysis
river water system
water quality characteristics
url https://www.mdpi.com/2073-4441/12/9/2411