On the application of clustering for extracting driving scenarios from vehicle data
If we want to extract test cases from driving data for the purpose of testing vehicles, we want to avoid using similar test cases. In this paper, we focus on this topic. We provide a method for extracting driving episodes from data utilizing clustering algorithms. This method starts with clustering...
Main Authors: | Nour Chetouane, Franz Wotawa |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2022-09-01 |
Series: | Machine Learning with Applications |
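Subjects: | Clustering for information extraction; Driving data extraction; Experimental evaluation; Comparison of clustering algorithms |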
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666827022000664 |
_version_ | 1817995017852026880 |
---|---|
author | Nour Chetouane; Franz Wotawa |
collection | DOAJ |
description | If we want to extract test cases from driving data for the purpose of testing vehicles, we want to avoid using similar test cases. In this paper, we focus on this topic. We provide a method for extracting driving episodes from data utilizing clustering algorithms. This method starts with clustering driving data. Afterward, data points representing time-ordered sequences are obtained from a cluster, forming a driving episode. Besides outlining the foundations, we present the results of an experimental evaluation where we considered six different clustering algorithms and available driving data from three German cities. To evaluate the cluster quality, we utilize three cluster validity metrics. In addition, we introduce a measure for the quality of extracted episodes relying on the Pearson coefficient. The experimental evaluation showed that the Pearson coefficient can rank clustering algorithms better than the three cluster validity metrics. We can extract meaningful episodes from driving data using any clustering algorithm considering four to eight clusters. Combining k-means clustering with auto-encoders leads to the best Pearson correlation. SOM is the slowest clustering method, and Canopy is the fastest. |
first_indexed | 2024-04-14T01:59:43Z |
format | Article |
id | doaj.art-1278970bea1644d8a1b9ab31006b70b0 |
institution | Directory of Open Access Journals |
issn | 2666-8270 |
language | English |
last_indexed | 2024-04-14T01:59:43Z |
publishDate | 2022-09-01 |
publisher | Elsevier |
record_format | Article |
series | Machine Learning with Applications |
spelling | doaj.art-1278970bea1644d8a1b9ab31006b70b0; 2022-12-22T02:18:52Z; eng; Elsevier; Machine Learning with Applications; 2666-8270; 2022-09-01; 9; 100377; On the application of clustering for extracting driving scenarios from vehicle data; Nour Chetouane (0); Franz Wotawa (1); Christian-Doppler Laboratory for Quality Assurance Methodologies for Autonomous Cyber–Physical Systems, Institute for Software Technology, Graz University of Technology, Inffeldgasse 16b/2, Graz, 8010, Austria; Corresponding author.; Christian-Doppler Laboratory for Quality Assurance Methodologies for Autonomous Cyber–Physical Systems, Institute for Software Technology, Graz University of Technology, Inffeldgasse 16b/2, Graz, 8010, Austria; http://www.sciencedirect.com/science/article/pii/S2666827022000664; Clustering for information extraction; Driving data extraction; Experimental evaluation; Comparison of clustering algorithms |
title | On the application of clustering for extracting driving scenarios from vehicle data |
topic | Clustering for information extraction; Driving data extraction; Experimental evaluation; Comparison of clustering algorithms |
url | http://www.sciencedirect.com/science/article/pii/S2666827022000664 |
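
The description above says that driving data is clustered first and that time-ordered sequences of data points are then taken from a cluster to form a driving episode. The record does not say how the sequences are delimited, so the following is only a minimal sketch under one assumption: an episode is a maximal run of consecutive samples that share a cluster label. The function name `extract_episodes`, the `min_len` parameter, and the sample labels are hypothetical and are not taken from the article.

```python
from itertools import groupby


def extract_episodes(labels, min_len=2):
    """Group time-ordered cluster labels into candidate episodes.

    labels:  cluster label of each sample, in time order (assumed to be
             the output of some clustering algorithm; hypothetical input).
    min_len: shortest run kept as an episode (an assumption of this sketch).

    Returns a list of (label, start_index, end_index), end index inclusive.
    """
    episodes = []
    start = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)  # number of consecutive samples with this label
        if length >= min_len:
            episodes.append((label, start, start + length - 1))
        start += length
    return episodes


# Made-up labels for ten consecutive samples
labels = [0, 0, 0, 2, 2, 1, 0, 0, 0, 0]
episodes = extract_episodes(labels)
```

Under this assumption, the same cluster can yield several episodes when its samples are separated in time, which is why the result is a list of index ranges and not one range per cluster.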