A Comparative Study of Frequent Pattern Mining with Trajectory Data


Bibliographic Details
Main Authors: Shiting Ding, Zhiheng Li, Kai Zhang, Feng Mao
Format: Article
Language: English
Published: MDPI AG 2022-10-01
Series: Sensors
Online Access: https://www.mdpi.com/1424-8220/22/19/7608
Description
Summary: Sequential pattern mining (SPM) is a major class of data mining topics with a wide range of applications. The continuous and uncertain nature of trajectory data makes it distinctly different from typical transactional data, so it requires additional data transformation before SPM can be applied. However, little research has compared the performance of SPM algorithms and their applications in the context of trajectory data. This study selected several representative sequential pattern mining algorithms and evaluated them under various parameter settings to understand how those parameters affect performance. We studied the resulting sequential patterns, runtime, and RAM consumption on a taxi trajectory dataset, the T-drive dataset. The work demonstrates a method for discretizing trajectory data and applies different SPM algorithms to the resulting trajectory databases. The results were visualized on actual Beijing road maps, reflecting traffic congestion conditions. The results show that contiguous constraint-based algorithms can provide a concise representation of output sequences and can function at low min_sup with balanced RAM consumption and execution time. This study can serve as a guide for academics and professionals in choosing the most suitable SPM algorithm for applications that involve trajectory data.
ISSN: 1424-8220
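
The summary mentions two steps, discretizing trajectories and mining contiguous sequential patterns above a min_sup threshold, and these can be illustrated with a minimal sketch. This is not the authors' method: the uniform grid discretization, the function names `discretize` and `contiguous_patterns`, and the toy coordinates are all assumptions chosen for illustration, and support is taken here to be the fraction of trajectories that contain a pattern.

```python
from collections import defaultdict


def discretize(points, origin, cell_size):
    """Map (x, y) points to grid-cell ids, collapsing consecutive repeats.

    Assumption: a uniform square grid; the paper may discretize differently.
    """
    x0, y0 = origin
    cells = []
    for x, y in points:
        cell = (int((x - x0) // cell_size), int((y - y0) // cell_size))
        # A vehicle lingering in one cell contributes that cell only once.
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells


def contiguous_patterns(sequences, min_sup, length):
    """Return contiguous subsequences of `length` cells whose support
    (fraction of sequences containing them) is at least `min_sup`."""
    counts = defaultdict(int)
    for seq in sequences:
        # A set ensures each pattern is counted at most once per sequence.
        seen = {tuple(seq[i:i + length]) for i in range(len(seq) - length + 1)}
        for pattern in seen:
            counts[pattern] += 1
    n = len(sequences)
    return {p: c / n for p, c in counts.items() if c / n >= min_sup}


# Toy trajectories on a unit grid (hypothetical data, not from T-drive).
t1 = discretize([(0.5, 0.5), (0.6, 0.5), (1.5, 0.5), (2.5, 0.5)], (0.0, 0.0), 1.0)
t2 = discretize([(1.5, 0.5), (2.5, 0.5), (2.5, 1.5)], (0.0, 0.0), 1.0)
t3 = discretize([(0.5, 0.5), (2.5, 0.5)], (0.0, 0.0), 1.0)
frequent = contiguous_patterns([t1, t2, t3], min_sup=0.6, length=2)
```

In this toy run only the cell pair (1, 0) followed by (2, 0) appears in at least 60% of the trajectories. Lowering `min_sup` admits more patterns, which is the trade-off between output size and resource use that the summary refers to.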