Time-Series Representation and Clustering Approaches for Sharing Bike Usage Mining

Massive bike-sharing systems (BSS) usage and performance data have been collected for years over various locations. Nevertheless, researchers have encountered several challenges in dealing with massive BSS data. Two challenges that previous studies leave room to improve on are 1) reducing high dimensi...


Bibliographic Details
Main Authors: Duo Li, Yifei Zhao, Yan Li
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/8928512/
_version_ 1818557551946498048
author Duo Li
Yifei Zhao
Yan Li
collection DOAJ
description Massive bike-sharing systems (BSS) usage and performance data have been collected for years over various locations. Nevertheless, researchers have encountered several challenges in dealing with massive BSS data. Two challenges that previous studies leave room to improve on are 1) reducing the high dimensionality and noise of BSS time series data and 2) extracting informative usage patterns from massive BSS data. This paper extracts patterns and reduces the dimensionality of BSS usage data by exploring time series representation and clustering. A reduced dimension allows us to approximate BSS usage efficiently and with reasonable accuracy, and the approximation can then be used for bike usage clustering, classification, and prediction. We employ a non-data-adaptive representation technique, the Discrete Wavelet Transform (DWT), to reduce dimensionality and filter random errors out of the raw time series. The time series are clustered using k-means, with similarity measured by Dynamic Time Warping (DTW) and cluster prototypes computed by DTW barycenter averaging (DBA). The proposed approaches are applied to a 3-month bike usage dataset from the BSS of Chicago. The results show that DWT can effectively reduce dimensionality, filter out random errors, and reveal the main characteristics of the raw time series. The clustering approach can differentiate and discover bike usage patterns across different stations.
first_indexed 2024-12-14T00:01:00Z
format Article
id doaj.art-6f5ba7d57505454fb00e006e83001129
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-14T00:01:00Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-6f5ba7d57505454fb00e006e83001129
2022-12-21T23:26:20Z
eng
IEEE
IEEE Access
2169-3536
2019-01-01
Volume 7, pages 177856-177863
10.1109/ACCESS.2019.2958378
8928512
Time-Series Representation and Clustering Approaches for Sharing Bike Usage Mining
Duo Li (https://orcid.org/0000-0003-0142-9290), School of Highway, Chang’an University, Xi’an, China
Yifei Zhao (https://orcid.org/0000-0002-2553-9235), School of Highway, Chang’an University, Xi’an, China
Yan Li (https://orcid.org/0000-0002-1688-6067), School of Highway, Chang’an University, Xi’an, China
https://ieeexplore.ieee.org/document/8928512/
Sharing bike system
time series data mining
dynamic time warping (DTW)
DTW barycenter averaging (DBA)
title Time-Series Representation and Clustering Approaches for Sharing Bike Usage Mining
topic Sharing bike system
time series data mining
dynamic time warping (DTW)
DTW barycenter averaging (DBA)
url https://ieeexplore.ieee.org/document/8928512/
work_keys_str_mv AT duoli timeseriesrepresentationandclusteringapproachesforsharingbikeusagemining
AT yifeizhao timeseriesrepresentationandclusteringapproachesforsharingbikeusagemining
AT yanli timeseriesrepresentationandclusteringapproachesforsharingbikeusagemining
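
The two building blocks named in the description field, wavelet-based reduction and DTW-based comparison, can be illustrated with a minimal sketch. This is not the authors' implementation. The record does not say which wavelet or how many decomposition levels the paper uses, so the Haar wavelet, the `levels` parameter, the function names `haar_approx` and `dtw_distance`, and the sample station counts are all assumptions made for illustration. The k-means and DBA steps are left out.

```python
import math


def haar_approx(series, levels=1):
    """Return Haar approximation coefficients (assumed wavelet, see lead-in).

    Each level replaces adjacent pairs with their scaled sum, halving the length.
    """
    out = list(series)
    for _ in range(levels):
        if len(out) < 2:
            break
        if len(out) % 2:
            # Pad odd-length input by repeating the last value.
            out.append(out[-1])
        out = [(out[i] + out[i + 1]) / math.sqrt(2) for i in range(0, len(out), 2)]
    return out


def dtw_distance(a, b):
    """Dynamic Time Warping distance with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    # Hypothetical hourly trip counts for two stations (not from the dataset).
    station_a = [2, 3, 9, 14, 8, 4, 3, 2]
    station_b = [2, 2, 3, 10, 13, 7, 4, 2]
    reduced_a = haar_approx(station_a, levels=1)
    reduced_b = haar_approx(station_b, levels=1)
    print(len(reduced_a), dtw_distance(reduced_a, reduced_b))
```

Comparing the reduced series rather than the raw ones shrinks the quadratic DTW cost table, which is one practical reason to apply a representation step before clustering.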