Change Point Detection in Time Series via Multivariate Singular Spectrum Analysis

Bibliographic Details
Main Author: AlAnqary, Arwa
Other Authors: Shah, Devavrat
Format: Thesis
Published: Massachusetts Institute of Technology, 2022
Online Access: https://hdl.handle.net/1721.1/139610
Description
Summary: The objective of change-point detection (CPD) is to estimate the times of significant and abrupt changes in the dynamics of a system from multivariate time series observations. The CPD setup covers a wide range of real-world problems, such as quality control, medical diagnosis, speech recognition, and fraud detection. In this thesis, we develop and analyze a principled method for CPD that combines a variant of the multivariate singular spectrum analysis (mSSA) approach with the cumulative sum (CUSUM) procedure for sequential hypothesis testing. In particular, we model the underlying dynamics of the multivariate time series observations through the spatio-temporal factor model introduced recently in the mSSA literature. In this setting, a change point corresponds to a change in the underlying spatio-temporal model. As the primary contributions of this work, we develop a CUSUM-based algorithm to detect such change points in an online fashion. Further, we extend the analysis of CUSUM statistics, traditionally carried out for independent observations, to the dependent setting of (multivariate) time series under the spatio-temporal factor model. Specifically, we analyze the performance of our algorithm in terms of the average run length (ARL), a metric traditionally used in sequential hypothesis testing to measure the trade-off between the delay of a true detection and the running time until a false detection. We formally establish that, for any given detection parameter h > 0, the algorithm detects a change point with an average delay of O(h) time steps, while in the absence of a change it takes at least Ω(exp(h)) time steps, on average, before making a false detection. Finally, we empirically show that the proposed CPD method provides state-of-the-art performance on synthetic and benchmark datasets.
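
The following is a minimal, illustrative sketch of the kind of pipeline the summary describes, not the thesis's actual algorithm or statistic: an mSSA-style low-rank model is estimated from a pre-change reference window by forming a stacked "page" matrix and taking its SVD, and a one-sided CUSUM recursion is then run on the per-page reconstruction error, raising an alarm once the statistic exceeds a detection parameter h. The function names (fit_low_rank_model, cusum_detect), the parameters (rank, page_len, drift, threshold_h), and the synthetic data are assumptions chosen for the example.

    # Illustrative sketch only: a generic CUSUM on mSSA-style projection residuals.
    import numpy as np


    def fit_low_rank_model(reference, page_len, rank):
        """Estimate a rank-`rank` basis of within-page temporal patterns.

        `reference` has shape (T, n): T time steps of an n-dimensional series.
        The series is cut into non-overlapping pages of length `page_len`, and
        each column of the page matrix is one page of one coordinate.
        """
        T, n = reference.shape
        num_pages = T // page_len
        pages = reference[: num_pages * page_len].reshape(num_pages, page_len, n)
        page_matrix = pages.transpose(1, 0, 2).reshape(page_len, num_pages * n)
        U, _, _ = np.linalg.svd(page_matrix, full_matrices=False)
        return U[:, :rank]  # top-`rank` left singular vectors


    def cusum_detect(stream, basis, page_len, drift, threshold_h):
        """Run a one-sided CUSUM recursion on per-page reconstruction errors.

        Returns the index of the first page whose statistic exceeds
        `threshold_h`, or None if no alarm is raised.
        """
        stat = 0.0
        T, n = stream.shape
        for p in range(T // page_len):
            page = stream[p * page_len : (p + 1) * page_len]      # (page_len, n)
            residual = page - basis @ (basis.T @ page)            # projection error
            score = np.mean(residual ** 2)
            # Classic CUSUM recursion: accumulate drift-corrected scores, floored at 0.
            stat = max(0.0, stat + score - drift)
            if stat > threshold_h:
                return p  # alarm at this page index
        return None


    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        n, page_len, rank = 5, 20, 4  # each sinusoid spans a 2-dim within-page subspace
        mixing = rng.normal(size=(n, 2))
        t = np.arange(4000)
        latent = np.stack([np.sin(2 * np.pi * t / 50), np.cos(2 * np.pi * t / 80)], axis=1)
        series = latent @ mixing.T + 0.1 * rng.normal(size=(len(t), n))
        # Inject an abrupt change in the underlying dynamics at t = 2000.
        series[2000:] += np.sin(2 * np.pi * t[2000:, None] / 15)
        basis = fit_low_rank_model(series[:1000], page_len, rank)
        alarm_page = cusum_detect(series[1000:], basis, page_len, drift=0.05, threshold_h=5.0)
        print("alarm at time",
              None if alarm_page is None else 1000 + (alarm_page + 1) * page_len)

In this generic recursion, raising the threshold h increases the detection delay roughly linearly while the expected time until a false alarm grows much faster, which mirrors the O(h) delay versus Ω(exp(h)) false-detection trade-off stated in the summary.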