Side-Length-Independent Motif (<i>SLIM</i>): Motif Discovery and Volatility Analysis in Time Series—<i>SAX</i>, <i>MDL</i> and the Matrix Profile
As big data-sets become more widely available, the importance of motif (or repeated pattern) identification and analysis increases. To date, the majority of motif identification algorithms that permit flexibility of sub-sequence length do so over a given range, with the restriction...
Main Authors: | Eoin Cartwright, Martin Crane, Heather J. Ruskin |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-02-01 |
Series: | Forecasting |
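Subjects: | financial time series, matrix profile, symbolic aggregate approximation (<i>SAX</i>), minimum description length (<i>MDL</i>), time series motifs |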
Online Access: | https://www.mdpi.com/2571-9394/4/1/13 |
_version_ | 1797471568648994816 |
---|---|
author | Eoin Cartwright Martin Crane Heather J. Ruskin |
author_facet | Eoin Cartwright Martin Crane Heather J. Ruskin |
author_sort | Eoin Cartwright |
collection | DOAJ |
description | As big data-sets become more widely available, the importance of motif (or repeated pattern) identification and analysis increases. To date, the majority of motif identification algorithms that permit flexibility of sub-sequence length do so over a given range, with the restriction that both sides of an identified sub-sequence pair are of equal length. In this article, motivated by a better localised representation of variations in time series, a novel approach to the identification of motifs is discussed, which allows for some flexibility in side-length. The advantages of this flexibility include improved recognition of localised similar behaviour (manifested as <i>motif shape</i>) over varying timescales, improved interpretation of localised volatility patterns, and a visual comparison of relative volatility levels of series at a globalised level. The process described extends and modifies established techniques, namely <i>SAX</i>, <i>MDL</i> and the Matrix Profile, allowing advantageous properties of leading algorithms for data analysis and dimensionality reduction to be incorporated and future-proofed. Although this technique is potentially applicable to any time series analysis, the focus here is on financial and energy sector applications, for which real-world examples examining <i>S&P500</i> and <i>Open Power System Data</i> are provided for illustration. |
first_indexed | 2024-03-09T19:50:38Z |
format | Article |
id | doaj.art-353c4e870d0d44869c90071ce7d9183f |
institution | Directory Open Access Journal |
issn | 2571-9394 |
language | English |
last_indexed | 2024-03-09T19:50:38Z |
publishDate | 2022-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Forecasting |
spelling | doaj.art-353c4e870d0d44869c90071ce7d9183f; 2023-11-24T01:11:49Z; eng; MDPI AG; Forecasting; 2571-9394; 2022-02-01; vol. 4, no. 1, pp. 219-237; 10.3390/forecast4010013; Side-Length-Independent Motif (<i>SLIM</i>): Motif Discovery and Volatility Analysis in Time Series—<i>SAX</i>, <i>MDL</i> and the Matrix Profile; Eoin Cartwright (Modelling & Scientific Computing Group (ModSci), School of Computing, Dublin City University, D09Y074 Dublin, Ireland); Martin Crane (ADAPT Centre, School of Computing, Dublin City University, D09Y074 Dublin, Ireland); Heather J. Ruskin (Modelling & Scientific Computing Group (ModSci), School of Computing, Dublin City University, D09Y074 Dublin, Ireland); As big data-sets become more widely available, the importance of motif (or repeated pattern) identification and analysis increases. To date, the majority of motif identification algorithms that permit flexibility of sub-sequence length do so over a given range, with the restriction that both sides of an identified sub-sequence pair are of equal length. In this article, motivated by a better localised representation of variations in time series, a novel approach to the identification of motifs is discussed, which allows for some flexibility in side-length. The advantages of this flexibility include improved recognition of localised similar behaviour (manifested as <i>motif shape</i>) over varying timescales, improved interpretation of localised volatility patterns, and a visual comparison of relative volatility levels of series at a globalised level. The process described extends and modifies established techniques, namely <i>SAX</i>, <i>MDL</i> and the Matrix Profile, allowing advantageous properties of leading algorithms for data analysis and dimensionality reduction to be incorporated and future-proofed. Although this technique is potentially applicable to any time series analysis, the focus here is on financial and energy sector applications, for which real-world examples examining <i>S&P500</i> and <i>Open Power System Data</i> are provided for illustration.; https://www.mdpi.com/2571-9394/4/1/13; financial time series; matrix profile; symbolic aggregate approximation (<i>SAX</i>); minimum description length (<i>MDL</i>); time series motifs |
spellingShingle | Eoin Cartwright Martin Crane Heather J. Ruskin Side-Length-Independent Motif (<i>SLIM</i>): Motif Discovery and Volatility Analysis in Time Series—<i>SAX</i>, <i>MDL</i> and the Matrix Profile Forecasting financial time series matrix profile symbolic aggregate approximation (<i>SAX</i>) minimum description length (<i>MDL</i>) time series motifs |
title | Side-Length-Independent Motif (<i>SLIM</i>): Motif Discovery and Volatility Analysis in Time Series—<i>SAX</i>, <i>MDL</i> and the Matrix Profile |
title_full | Side-Length-Independent Motif (<i>SLIM</i>): Motif Discovery and Volatility Analysis in Time Series—<i>SAX</i>, <i>MDL</i> and the Matrix Profile |
title_fullStr | Side-Length-Independent Motif (<i>SLIM</i>): Motif Discovery and Volatility Analysis in Time Series—<i>SAX</i>, <i>MDL</i> and the Matrix Profile |
title_full_unstemmed | Side-Length-Independent Motif (<i>SLIM</i>): Motif Discovery and Volatility Analysis in Time Series—<i>SAX</i>, <i>MDL</i> and the Matrix Profile |
title_short | Side-Length-Independent Motif (<i>SLIM</i>): Motif Discovery and Volatility Analysis in Time Series—<i>SAX</i>, <i>MDL</i> and the Matrix Profile |
title_sort | side length independent motif i slim i motif discovery and volatility analysis in time series i sax i i mdl i and the matrix profile |
topic | financial time series matrix profile symbolic aggregate approximation (<i>SAX</i>) minimum description length (<i>MDL</i>) time series motifs |
url | https://www.mdpi.com/2571-9394/4/1/13 |
work_keys_str_mv | AT eoincartwright sidelengthindependentmotifislimimotifdiscoveryandvolatilityanalysisintimeseriesisaxiimdliandthematrixprofile AT martincrane sidelengthindependentmotifislimimotifdiscoveryandvolatilityanalysisintimeseriesisaxiimdliandthematrixprofile AT heatherjruskin sidelengthindependentmotifislimimotifdiscoveryandvolatilityanalysisintimeseriesisaxiimdliandthematrixprofile |