Efficient Road Traffic Video Congestion Classification Based on the Multi-Head Self-Attention Vision Transformer Model
Due to rapid population growth, traffic congestion has become one of the major issues in urban areas. The use of technology may help to address this issue. This paper proposes a new Multi-head Self-attention Vision Transformer (MSViT)-based macroscopic approach for road traffic congestion c...
Main Authors: | Khalladi Sofiane Abdelkrim, Ouessai Asmâa, Benamara Nadir Kamel, Keche Mokhtar |
---|---|
Format: | Article |
Language: | English |
Published: | Sciendo, 2024-02-01 |
Series: | Transport and Telecommunication |
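Subjects: | road traffic classification; macroscopic approach; vision transformers; multi-head self-attention; deep learning |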
Online Access: | https://doi.org/10.2478/ttj-2024-0003 |
_version_ | 1797303100327854080 |
---|---|
author | Khalladi Sofiane Abdelkrim; Ouessai Asmâa; Benamara Nadir Kamel; Keche Mokhtar |
collection | DOAJ |
description | Due to rapid population growth, traffic congestion has become one of the major issues in urban areas, and technology may help to address it. This paper proposes a new macroscopic approach to road traffic congestion classification based on a Multi-head Self-attention Vision Transformer (MSViT). To evaluate this approach, we use the UCSD (University of California San Diego) dataset, which covers different weather conditions (clear, overcast and rainy) and different traffic scenarios (light, medium and heavy). The classification accuracy reached 99.76% on this dataset and 99.37% when night-mode frames are added to it. The proposed MSViT-based method outperforms the state-of-the-art macroscopic and microscopic methods that have been evaluated on the same UCSD dataset, which makes it an efficient solution for traffic congestion prediction. |
first_indexed | 2024-03-07T23:48:05Z |
format | Article |
id | doaj.art-57809967deb243b58cd15b8340c24944 |
institution | Directory of Open Access Journals |
issn | 1407-6179 |
language | English |
last_indexed | 2024-03-07T23:48:05Z |
publishDate | 2024-02-01 |
publisher | Sciendo |
record_format | Article |
series | Transport and Telecommunication |
spelling | doaj.art-57809967deb243b58cd15b8340c24944; 2024-02-19T09:04:01Z; eng; Sciendo; Transport and Telecommunication; 1407-6179; 2024-02-01; 25; 1; 20; 30; 10.2478/ttj-2024-0003; Efficient Road Traffic Video Congestion Classification Based on the Multi-Head Self-Attention Vision Transformer Model; Khalladi Sofiane Abdelkrim (0), Ouessai Asmâa (1), Benamara Nadir Kamel (2), Keche Mokhtar (3); (0) Signals and images laboratory, Faculty of Electrical Engineering, Department of Electronics, University of Sciences and Technology of Oran Mohamed Boudiaf USTO-MB, B.P. 1505, El Mnaouar-Bir el Djir-Oran, Algeria; (1) Faculty of Technology, Department of Telecommunications, Dr. Tahar Moulay University, Saida, Algeria; (2) Signals and images laboratory, Faculty of Electrical Engineering, Department of Electronics, University of Sciences and Technology of Oran Mohamed Boudiaf USTO-MB, B.P. 1505, El Mnaouar-Bir el Djir-Oran, Algeria; (3) Signals and images laboratory, Faculty of Electrical Engineering, Department of Electronics, University of Sciences and Technology of Oran Mohamed Boudiaf USTO-MB, B.P. 1505, El Mnaouar-Bir el Djir-Oran, Algeria; https://doi.org/10.2478/ttj-2024-0003; road traffic classification; macroscopic approach; vision transformers; multi-head self-attention; deep learning |
title | Efficient Road Traffic Video Congestion Classification Based on the Multi-Head Self-Attention Vision Transformer Model |
topic | road traffic classification; macroscopic approach; vision transformers; multi-head self-attention; deep learning |
url | https://doi.org/10.2478/ttj-2024-0003 |