On the application of clustering for extracting driving scenarios from vehicle data

If we want to extract test cases from driving data for the purpose of testing vehicles, we want to avoid using similar test cases. In this paper, we focus on this topic. We provide a method for extracting driving episodes from data utilizing clustering algorithms. This method starts with clustering...

Bibliographic Details
Main Authors:	Nour Chetouane, Franz Wotawa
Format:	Article
Language:	English
Published:	Elsevier 2022-09-01
Series:	Machine Learning with Applications
Subjects:	Clustering for information extraction Driving data extraction Experimental evaluation Comparison of clustering algorithms
Online Access:	http://www.sciencedirect.com/science/article/pii/S2666827022000664

_version_	1817995017852026880
author	Nour Chetouane Franz Wotawa
author_facet	Nour Chetouane Franz Wotawa
author_sort	Nour Chetouane
collection	DOAJ
description	If we want to extract test cases from driving data for the purpose of testing vehicles, we want to avoid using similar test cases. In this paper, we focus on this topic. We provide a method for extracting driving episodes from data utilizing clustering algorithms. This method starts with clustering driving data. Afterward, data points representing time-ordered sequences are obtained from the cluster forming a driving episode. Besides outlining the foundations, we present the results of an experimental evaluation where we considered six different clustering algorithms and available driving data from three German cities. To evaluate the cluster quality, we utilize three cluster validity metrics. In addition, we introduce a measure for the quality of extracted episodes relying on the Pearson coefficient. Experimental evaluation showed that the Pearson coefficient can rank clustering algorithms better than the three cluster validity metrics. We can extract meaningful episodes from driving data using any clustering algorithm considering four to eight clusters. Combining k-means clustering with auto-encoders leads to the best Pearson correlation. SOM is the slowest clustering method, and Canopy is the fastest.
first_indexed	2024-04-14T01:59:43Z
format	Article
id	doaj.art-1278970bea1644d8a1b9ab31006b70b0
institution	Directory Open Access Journal
issn	2666-8270
language	English
last_indexed	2024-04-14T01:59:43Z
publishDate	2022-09-01
publisher	Elsevier
record_format	Article
series	Machine Learning with Applications
spelling	doaj.art-1278970bea1644d8a1b9ab31006b70b0 2022-12-22T02:18:52Z eng Elsevier Machine Learning with Applications 2666-8270 2022-09-01 9 100377 On the application of clustering for extracting driving scenarios from vehicle data Nour Chetouane 0 Franz Wotawa 1 Christian-Doppler Laboratory for Quality Assurance Methodologies for Autonomous Cyber–Physical Systems, Institute for Software Technology, Graz University of Technology, Inffeldgasse 16b/2, Graz, 8010, Austria Corresponding author.; Christian-Doppler Laboratory for Quality Assurance Methodologies for Autonomous Cyber–Physical Systems, Institute for Software Technology, Graz University of Technology, Inffeldgasse 16b/2, Graz, 8010, Austria If we want to extract test cases from driving data for the purpose of testing vehicles, we want to avoid using similar test cases. In this paper, we focus on this topic. We provide a method for extracting driving episodes from data utilizing clustering algorithms. This method starts with clustering driving data. Afterward, data points representing time-ordered sequences are obtained from the cluster forming a driving episode. Besides outlining the foundations, we present the results of an experimental evaluation where we considered six different clustering algorithms and available driving data from three German cities. To evaluate the cluster quality, we utilize three cluster validity metrics. In addition, we introduce a measure for the quality of extracted episodes relying on the Pearson coefficient. Experimental evaluation showed that the Pearson coefficient can rank clustering algorithms better than the three cluster validity metrics. We can extract meaningful episodes from driving data using any clustering algorithm considering four to eight clusters. Combining k-means clustering with auto-encoders leads to the best Pearson correlation. SOM is the slowest clustering method, and Canopy is the fastest. http://www.sciencedirect.com/science/article/pii/S2666827022000664 Clustering for information extraction Driving data extraction Experimental evaluation Comparison of clustering algorithms
title	On the application of clustering for extracting driving scenarios from vehicle data
topic	Clustering for information extraction Driving data extraction Experimental evaluation Comparison of clustering algorithms
url	http://www.sciencedirect.com/science/article/pii/S2666827022000664
