Technical note: A procedure to clean, decompose, and aggregate time series

Errors, gaps, and outliers complicate and sometimes invalidate the analysis of time series. While most fields have developed their own strategy to clean the raw data, no generic procedure has been promoted to standardize the pre-processing. This lack of harmonization makes the inter-compari...

Bibliographic Details
Main Author: F. Ritter
Format: Article
Language: English
Published: Copernicus Publications 2023-01-01
Series: Hydrology and Earth System Sciences
Online Access: https://hess.copernicus.org/articles/27/349/2023/hess-27-349-2023.pdf
_version_ 1797948996621172736
author F. Ritter
collection DOAJ
description Errors, gaps, and outliers complicate and sometimes invalidate the analysis of time series. While most fields have developed their own strategy to clean the raw data, no generic procedure has been promoted to standardize the pre-processing. This lack of harmonization makes the inter-comparison of studies difficult, and leads to screening methods that can be arbitrary or case-specific. This study provides a generic pre-processing procedure implemented in R (ctbi for cyclic/trend decomposition using bin interpolation) dedicated to univariate time series. Ctbi is based on data binning and decomposes the time series into a long-term trend and a cyclic component (quantified by a new metric, the Stacked Cycles Index) to finally aggregate the data. Outliers are flagged with an enhanced box plot rule called Logbox that corrects biases due to the sample size and that is adapted to non-Gaussian residuals. Three different Earth science datasets (contaminated with gaps and outliers) are successfully cleaned and aggregated with ctbi. This illustrates the robustness of this procedure that can be valuable to any discipline.
first_indexed 2024-04-10T21:52:21Z
format Article
id doaj.art-87f5320c5f52478f91775802bd2e8caf
institution Directory Open Access Journal
issn 1027-5606, 1607-7938
language English
last_indexed 2024-04-10T21:52:21Z
publishDate 2023-01-01
publisher Copernicus Publications
record_format Article
series Hydrology and Earth System Sciences
spelling doaj.art-87f5320c5f52478f91775802bd2e8caf; 2023-01-18T11:45:07Z; eng; Copernicus Publications; Hydrology and Earth System Sciences; 1027-5606; 1607-7938; 2023-01-01; 27; 349; 361; 10.5194/hess-27-349-2023; Technical note: A procedure to clean, decompose, and aggregate time series; F. Ritter; https://hess.copernicus.org/articles/27/349/2023/hess-27-349-2023.pdf
title Technical note: A procedure to clean, decompose, and aggregate time series
title_sort technical note a procedure to clean decompose and aggregate time series
url https://hess.copernicus.org/articles/27/349/2023/hess-27-349-2023.pdf
work_keys_str_mv AT fritter technicalnoteaproceduretocleandecomposeandaggregatetimeseries