Trip Purpose Imputation Using GPS Trajectories with Machine Learning

We studied trip purpose imputation using data mining and machine learning techniques based on a dataset of GPS-based trajectories gathered in Switzerland. With a large number of labeled activities in eight categories, we explored location information using hierarchical clustering and achieved a clas...

Bibliographic Details
Main Authors: Qinggang Gao, Joseph Molloy, Kay W. Axhausen
Format: Article
Language: English
Published: MDPI AG 2021-11-01
Series: ISPRS International Journal of Geo-Information
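Subjects: class noise; data mining; ensemble filter; hierarchical clustering; machine learning; random forest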
Online Access: https://www.mdpi.com/2220-9964/10/11/775
_version_ 1797510068666630144
author Qinggang Gao
Joseph Molloy
Kay W. Axhausen
author_sort Qinggang Gao
collection DOAJ
description We studied trip purpose imputation using data mining and machine learning techniques based on a dataset of GPS-based trajectories gathered in Switzerland. With a large number of labeled activities in eight categories, we explored location information using hierarchical clustering and achieved a classification accuracy of 86.7% using a random forest approach as a baseline. The contribution of this study is summarized below. Firstly, using information from GPS trajectories exclusively without personal information shows a negligible decrease in accuracy (0.9%), which indicates the good performance of our data mining steps and the wide applicability of our imputation scheme in case of limited information availability. Secondly, the dependence of model performance on the geographical location, the number of participants, and the duration of the survey is investigated to provide a reference when comparing classification accuracy. Furthermore, we show the ensemble filter to be an excellent tool in this research field not only because of the increased accuracy (93.6%), especially for minority classes, but also the reduced uncertainties in blindly trusting the labeling of activities by participants, which is vulnerable to class noise due to the large survey response burden. Finally, the trip purpose derivation accuracy across participants reaches 74.8%, which is significant and suggests the possibility of effectively applying a model trained on GPS trajectories of a small subset of citizens to a larger GPS trajectory sample.
first_indexed 2024-03-10T05:26:31Z
format Article
id doaj.art-5b45681ef4ff4326aacb528c01cb781d
institution Directory Open Access Journal
issn 2220-9964
language English
last_indexed 2024-03-10T05:26:31Z
publishDate 2021-11-01
publisher MDPI AG
record_format Article
series ISPRS International Journal of Geo-Information
spelling doaj.art-5b45681ef4ff4326aacb528c01cb781d
2023-11-22T23:36:40Z
eng
MDPI AG
ISPRS International Journal of Geo-Information
2220-9964
2021-11-01
Volume 10, Issue 11, Article 775
10.3390/ijgi10110775
Trip Purpose Imputation Using GPS Trajectories with Machine Learning
Qinggang Gao, Institute for Transport Planning and Systems, ETH Zurich, 8093 Zurich, Switzerland
Joseph Molloy, Institute for Transport Planning and Systems, ETH Zurich, 8093 Zurich, Switzerland
Kay W. Axhausen, Institute for Transport Planning and Systems, ETH Zurich, 8093 Zurich, Switzerland
https://www.mdpi.com/2220-9964/10/11/775
class noise
data mining
ensemble filter
hierarchical clustering
machine learning
random forest
title Trip Purpose Imputation Using GPS Trajectories with Machine Learning
title_sort trip purpose imputation using gps trajectories with machine learning
topic class noise
data mining
ensemble filter
hierarchical clustering
machine learning
random forest
url https://www.mdpi.com/2220-9964/10/11/775