Side-Length-Independent Motif (<i>SLIM</i>): Motif Discovery and Volatility Analysis in Time Series—<i>SAX</i>, <i>MDL</i> and the Matrix Profile

As the availability of big data-sets becomes more widespread, so the importance of motif (or repeated pattern) identification and analysis increases. To date, the majority of motif identification algorithms that permit flexibility of sub-sequence length do so over a given range, with the restriction...


Bibliographic Details
Main Authors: Eoin Cartwright, Martin Crane, Heather J. Ruskin
Format: Article
Language: English
Published: MDPI AG 2022-02-01
Series: Forecasting
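Subjects: financial time series; matrix profile; symbolic aggregate approximation (<i>SAX</i>); minimum description length (<i>MDL</i>); time series motifs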
Online Access: https://www.mdpi.com/2571-9394/4/1/13
author Eoin Cartwright
Martin Crane
Heather J. Ruskin
author_facet Eoin Cartwright
Martin Crane
Heather J. Ruskin
author_sort Eoin Cartwright
collection DOAJ
description As the availability of big data-sets becomes more widespread, so the importance of motif (or repeated pattern) identification and analysis increases. To date, the majority of motif identification algorithms that permit flexibility of sub-sequence length do so over a given range, with the restriction that both sides of an identified sub-sequence pair are of equal length. In this article, motivated by a better localised representation of variations in time series, a novel approach to the identification of motifs is discussed, which allows for some flexibility in side-length. The advantages of this flexibility include improved recognition of localised similar behaviour (manifested as <i>motif shape</i>) over varying timescales, as well as improved interpretation of localised volatility patterns and a visual comparison of relative volatility levels of series at a globalised level. The process described extends and modifies established techniques, namely <i>SAX</i>, <i>MDL</i> and the Matrix Profile, allowing advantageous properties of leading algorithms for data analysis and dimensionality reduction to be incorporated and future-proofed. Although this technique is potentially applicable to any time series analysis, the focus here is on financial and energy sector applications, with real-world examples examining <i>S&P500</i> and <i>Open Power System Data</i> provided for illustration.
first_indexed 2024-03-09T19:50:38Z
format Article
id doaj.art-353c4e870d0d44869c90071ce7d9183f
institution Directory Open Access Journal
issn 2571-9394
language English
last_indexed 2024-03-09T19:50:38Z
publishDate 2022-02-01
publisher MDPI AG
record_format Article
series Forecasting
spelling doaj.art-353c4e870d0d44869c90071ce7d9183f
2023-11-24T01:11:49Z
eng
MDPI AG
Forecasting, ISSN 2571-9394
2022-02-01, vol. 4, no. 1, pp. 219-237
DOI 10.3390/forecast4010013
Eoin Cartwright (Modelling & Scientific Computing Group (ModSci), School of Computing, Dublin City University, D09Y074 Dublin, Ireland)
Martin Crane (ADAPT Centre, School of Computing, Dublin City University, D09Y074 Dublin, Ireland)
Heather J. Ruskin (Modelling & Scientific Computing Group (ModSci), School of Computing, Dublin City University, D09Y074 Dublin, Ireland)
title Side-Length-Independent Motif (<i>SLIM</i>): Motif Discovery and Volatility Analysis in Time Series—<i>SAX</i>, <i>MDL</i> and the Matrix Profile
topic financial time series
matrix profile
symbolic aggregate approximation (<i>SAX</i>)
minimum description length (<i>MDL</i>)
time series motifs
url https://www.mdpi.com/2571-9394/4/1/13