Multivariate Time Series Density Clustering Algorithm Using Shapelet Space

Bibliographic Details
Main Authors: SHENG Jinchao, DU Mingjing, SUN Jiarui, LI Yurui
Format: Article
Language: zho
Published: Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press 2024-02-01
Series: Jisuanji kexue yu tansuo
Subjects:
Online Access: http://fcst.ceaj.org/fileup/1673-9418/PDF/2211099.pdf
Description
Summary: Multivariate time series clustering has become an important research topic in time series analysis. Compared with univariate time series, multivariate time series are more complex and more difficult to study. Although many clustering algorithms for multivariate time series have been proposed, they still struggle to achieve accuracy and interpretability at the same time. First, most existing work does not account for the length redundancy and inter-variable correlation of multivariate time series, which leads to large errors in the final similarity matrix. Second, most methods cluster the data with a partition-based paradigm, which performs poorly when the data follow a complex distribution in the numerical space and offers no explanation of the role of each variable and its space. To address these problems, this paper proposes MDCS, a multivariate time series adaptive-weight density clustering algorithm using Shapelet space, where a Shapelet is a highly informative contiguous subsequence. The algorithm first performs a Shapelet search for each variable and obtains that variable's own Shapelet space through an adaptive strategy. It then weights the numerical distributions generated by the variables to obtain a similarity matrix that is more consistent with the characteristics of the data distribution. Finally, the data are assigned to clusters using a shared-nearest-neighbor density peak clustering algorithm with an improved density calculation and a secondary allocation step. Experimental results on several real datasets show that MDCS achieves better clustering results than current state-of-the-art clustering algorithms, with average increases of 0.344 in normalized mutual information and 0.09 in Rand index, respectively, balancing performance and interpretability.
ISSN: 1673-9418
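
Illustrative note: the summary describes mapping each variable into its own Shapelet space and weighting the per-variable distributions into one similarity matrix. The sketch below is not the authors' MDCS implementation; it only illustrates the general idea under simplifying assumptions. The function names (`shapelet_distance`, `shapelet_space`, `weighted_distance_matrix`) are hypothetical, the shapelets are hand-picked rather than found by a search, and the weights are fixed constants rather than the adaptive weights the paper proposes. The density peak clustering step is omitted.

```python
import numpy as np


def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a shapelet and any
    same-length sliding window of a univariate series."""
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    return float(np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min())


def shapelet_space(series_set, shapelets):
    """Embed each univariate series (one per row) as a vector of
    distances to the given shapelets."""
    return np.array(
        [[shapelet_distance(s, sh) for sh in shapelets] for s in series_set]
    )


def weighted_distance_matrix(spaces, weights):
    """Weighted sum of per-variable pairwise Euclidean distance
    matrices computed in each variable's shapelet space.
    Assumption: weights are supplied by the caller, not learned."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    n = spaces[0].shape[0]
    total = np.zeros((n, n))
    for w, emb in zip(weights, spaces):
        diff = emb[:, None, :] - emb[None, :, :]
        total += w * np.sqrt((diff ** 2).sum(axis=2))
    return total


# Toy data with shape (n_samples, n_variables, series_length).
X = np.array(
    [
        [[0, 0, 1, 2, 1, 0], [1, 1, 1, 1, 1, 1]],
        [[0, 1, 2, 1, 0, 0], [1, 1, 1, 1, 1, 1]],
        [[0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1]],
    ],
    dtype=float,
)

# Hand-picked shapelets, one list per variable (hypothetical values).
shapelets_per_var = [[np.array([1.0, 2.0, 1.0])], [np.array([1.0, 1.0])]]

spaces = [
    shapelet_space(X[:, v, :], shapelets_per_var[v]) for v in range(X.shape[1])
]
D = weighted_distance_matrix(spaces, weights=[0.7, 0.3])
print(np.round(D, 3))
```

In this toy example the first two samples contain the same bump at different positions, so their distance in Shapelet space is zero even though a point-by-point comparison would separate them. A matrix like `D` could then be passed to a density-based clustering step.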