On the application of clustering for extracting driving scenarios from vehicle data

If we want to extract test cases from driving data for the purpose of testing vehicles, we want to avoid using similar test cases. In this paper, we focus on this topic. We provide a method for extracting driving episodes from data utilizing clustering algorithms. This method starts with clustering...


Bibliographic Details
Main Authors: Nour Chetouane, Franz Wotawa
Format: Article
Language: English
Published: Elsevier 2022-09-01
Series: Machine Learning with Applications
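Subjects: Clustering for information extraction; Driving data extraction; Experimental evaluation; Comparison of clustering algorithms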
Online Access: http://www.sciencedirect.com/science/article/pii/S2666827022000664
_version_ 1817995017852026880
author Nour Chetouane
Franz Wotawa
collection DOAJ
description If we want to extract test cases from driving data for the purpose of testing vehicles, we want to avoid using similar test cases. In this paper, we focus on this topic. We provide a method for extracting driving episodes from data utilizing clustering algorithms. This method starts with clustering driving data. Afterward, data points representing time-ordered sequences are obtained from the cluster forming a driving episode. Besides outlining the foundations, we present the results of an experimental evaluation where we considered six different clustering algorithms and available driving data from three German cities. To evaluate the cluster quality, we utilize three cluster validity metrics. In addition, we introduce a measure for the quality of extracted episodes relying on the Pearson coefficient. The experimental evaluation showed that the Pearson coefficient can rank clustering algorithms better than the three cluster validity metrics. We can extract meaningful episodes from driving data using any clustering algorithm considering four to eight clusters. Combining k-means clustering with auto-encoders leads to the best Pearson correlation. SOM is the slowest clustering method, and Canopy is the fastest.
first_indexed 2024-04-14T01:59:43Z
format Article
id doaj.art-1278970bea1644d8a1b9ab31006b70b0
institution Directory Open Access Journal
issn 2666-8270
language English
last_indexed 2024-04-14T01:59:43Z
publishDate 2022-09-01
publisher Elsevier
record_format Article
series Machine Learning with Applications
spelling doaj.art-1278970bea1644d8a1b9ab31006b70b0
2022-12-22T02:18:52Z
eng
Elsevier
Machine Learning with Applications
2666-8270
2022-09-01
9
100377
On the application of clustering for extracting driving scenarios from vehicle data
Nour Chetouane (0)
Franz Wotawa (1)
(0) Christian-Doppler Laboratory for Quality Assurance Methodologies for Autonomous Cyber–Physical Systems, Institute for Software Technology, Graz University of Technology, Inffeldgasse 16b/2, Graz, 8010, Austria
(1) Corresponding author.; Christian-Doppler Laboratory for Quality Assurance Methodologies for Autonomous Cyber–Physical Systems, Institute for Software Technology, Graz University of Technology, Inffeldgasse 16b/2, Graz, 8010, Austria
If we want to extract test cases from driving data for the purpose of testing vehicles, we want to avoid using similar test cases. In this paper, we focus on this topic. We provide a method for extracting driving episodes from data utilizing clustering algorithms. This method starts with clustering driving data. Afterward, data points representing time-ordered sequences are obtained from the cluster forming a driving episode. Besides outlining the foundations, we present the results of an experimental evaluation where we considered six different clustering algorithms and available driving data from three German cities. To evaluate the cluster quality, we utilize three cluster validity metrics. In addition, we introduce a measure for the quality of extracted episodes relying on the Pearson coefficient. The experimental evaluation showed that the Pearson coefficient can rank clustering algorithms better than the three cluster validity metrics. We can extract meaningful episodes from driving data using any clustering algorithm considering four to eight clusters. Combining k-means clustering with auto-encoders leads to the best Pearson correlation. SOM is the slowest clustering method, and Canopy is the fastest.
http://www.sciencedirect.com/science/article/pii/S2666827022000664
Clustering for information extraction
Driving data extraction
Experimental evaluation
Comparison of clustering algorithms
title On the application of clustering for extracting driving scenarios from vehicle data
topic Clustering for information extraction
Driving data extraction
Experimental evaluation
Comparison of clustering algorithms
url http://www.sciencedirect.com/science/article/pii/S2666827022000664
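Illustrative note on the description: the abstract says that driving data are clustered first and that time-ordered sequences of points from a cluster then form a driving episode. The sketch below shows one possible reading of that second step, assuming a cluster label has already been assigned to every time-ordered data point. The function name extract_episodes and the min_length parameter are hypothetical and are not taken from the article, whose actual procedure may differ.

```python
from itertools import groupby


def extract_episodes(labels, min_length=2):
    """Split a time-ordered sequence of cluster labels into episodes.

    An episode here is a maximal run of consecutive data points that
    share one cluster label (an assumption, not the article's definition).
    Returns (label, start_index, end_index) tuples, end index inclusive.
    min_length is a hypothetical filter that drops very short runs.
    """
    episodes = []
    position = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)  # number of points in this run
        if length >= min_length:
            episodes.append((label, position, position + length - 1))
        position += length
    return episodes


# Labels for eight consecutive samples, e.g. produced by any clustering step.
labels = [0, 0, 0, 2, 2, 1, 0, 0]
print(extract_episodes(labels))  # [(0, 0, 2), (2, 3, 4), (0, 6, 7)]
```

The single point labelled 1 is dropped because it is shorter than min_length, and cluster 0 yields two separate episodes because its points are not contiguous in time.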