Multivariate Singular Spectrum Analysis: A Principled, Practical, and Performant Solution for Time Series Imputation and Forecasting

Bibliographic Details
Main Author: Alomar, Abdullah
Other Authors: Shah, Devavrat
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/140365
Description: The analysis of multivariate time series data is of great interest across many domains, including cyber-physical systems, finance, retail, and healthcare, to name a few. A common goal across all of these domains is accurate imputation and forecasting of multivariate time series in the presence of noisy and/or missing data. Given the growing need to embed predictive functionality in high-performance systems, especially in applications with time series data (e.g., financial systems, control systems), it is increasingly vital that we build principled prediction algorithms that are statistically and computationally performant and more broadly accessible. To that end, we introduce a novel variant of multivariate Singular Spectrum Analysis (mSSA) that allows for accurate imputation and forecasting of both the time-varying mean and variance of multivariate time series. We further justify this algorithm by introducing a natural spatio-temporal factor model, under which the algorithm is theoretically analyzed. Specifically, we establish the in-sample prediction error of our mSSA variant for both imputation and forecasting. Further, we propose an incremental variant of the algorithm, upon which a real-time prediction system for time series data, tspDB, is instantiated and evaluated. tspDB aims to increase accessibility to predictive functionality for time series data through direct integration with existing relational time series databases. Finally, through rigorous experiments, we show that tspDB provides state-of-the-art statistical accuracy while maintaining superior computational performance, with incremental model updates, low model training time, and low latency for prediction queries.
Additional Contributor: Marzouk, Youssef
Affiliation: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Affiliation: Massachusetts Institute of Technology. Center for Computational Science and Engineering
Degree: S.M.
Record Dates: 2021-09; 2022-01-19T19:04:54.128Z; 2022-02-15T17:02:23Z; 2022-02-16T03:12:39Z
Rights: In Copyright - Educational Use Permitted; Copyright MIT; http://rightsstatements.org/page/InC-EDU/1.0/
File Format: application/pdf