Efficient Road Traffic Video Congestion Classification Based on the Multi-Head Self-Attention Vision Transformer Model

Due to rapid population growth, traffic congestion has become one of the major issues in urban areas. The utilization of technology may help to address this issue. This paper proposes a new Multi-head Self-attention Vision Transformer (MSViT)-based macroscopic approach for road traffic congestion c...

Bibliographic Details
Main Authors:	Khalladi Sofiane Abdelkrim, Ouessai Asmâa, Benamara Nadir Kamel, Keche Mokhtar
Format:	Article
Language:	English
Published:	Sciendo 2024-02-01
Series:	Transport and Telecommunication
Subjects:	road traffic classification; macroscopic approach; vision transformers; multi-head self-attention; deep learning
Online Access:	https://doi.org/10.2478/ttj-2024-0003

author	Khalladi Sofiane Abdelkrim, Ouessai Asmâa, Benamara Nadir Kamel, Keche Mokhtar
author_sort	Khalladi Sofiane Abdelkrim
collection	DOAJ
description	Due to rapid population growth, traffic congestion has become one of the major issues in urban areas. Technology may help to address this issue. This paper proposes a new macroscopic approach to road traffic congestion classification, based on a Multi-head Self-attention Vision Transformer (MSViT). To evaluate this approach, we use the UCSD (University of California San Diego) dataset, which covers different weather conditions (clear, overcast and rainy) and different traffic scenarios (light, medium and heavy). The classification accuracy reached 99.76% on this dataset and 99.37% when night-mode frames are added to it. The proposed MSViT-based method outperforms the state-of-the-art macroscopic and microscopic methods that have been evaluated on the same UCSD dataset, which makes it an efficient solution for traffic congestion prediction.
first_indexed	2024-03-07T23:48:05Z
format	Article
id	doaj.art-57809967deb243b58cd15b8340c24944
institution	Directory of Open Access Journals
issn	1407-6179
language	English
last_indexed	2024-03-07T23:48:05Z
publishDate	2024-02-01
publisher	Sciendo
record_format	Article
series	Transport and Telecommunication
spelling	doaj.art-57809967deb243b58cd15b8340c24944; 2024-02-19T09:04:01Z; eng; Sciendo; Transport and Telecommunication; 1407-6179; 2024-02-01; volume 25; issue 1; pages 20-30; 10.2478/ttj-2024-0003; Efficient Road Traffic Video Congestion Classification Based on the Multi-Head Self-Attention Vision Transformer Model; Khalladi Sofiane Abdelkrim (0), Ouessai Asmâa (1), Benamara Nadir Kamel (2), Keche Mokhtar (3); (0) Signals and images laboratory, Faculty of Electrical Engineering, Department of Electronics, University of Sciences and Technology of Oran Mohamed Boudiaf USTO-MB, B.P. 1505, El Mnaouar-Bir el Djir-Oran, Algeria; (1) Faculty of Technology, Department of Telecommunications, Dr. Tahar Moulay University, Saida, Algeria; (2) Signals and images laboratory, Faculty of Electrical Engineering, Department of Electronics, University of Sciences and Technology of Oran Mohamed Boudiaf USTO-MB, B.P. 1505, El Mnaouar-Bir el Djir-Oran, Algeria; (3) Signals and images laboratory, Faculty of Electrical Engineering, Department of Electronics, University of Sciences and Technology of Oran Mohamed Boudiaf USTO-MB, B.P. 1505, El Mnaouar-Bir el Djir-Oran, Algeria; Due to rapid population growth, traffic congestion has become one of the major issues in urban areas. The utilization of technology may help to address this issue. This paper proposes a new Multi-head Self-attention Vision Transformer (MSViT) based macroscopic approach, for road traffic congestion classification. To evaluate this approach, we use the UCSD (University of California San Diego) dataset that includes different weather conditions (clear, overcast and rainy) and different traffic scenarios (light, medium and heavy). The classification accuracy reached a high level of 99.76% with this dataset and 99.37% when night-mode frames are added to it. The proposed MSViT based method outperforms the state-of-the-art macroscopic and microscopic methods that have been evaluated using the same UCSD dataset, which makes it an efficient solution for traffic congestion prediction.; https://doi.org/10.2478/ttj-2024-0003; road traffic classification; macroscopic approach; vision transformers; multi-head self-attention; deep learning
title	Efficient Road Traffic Video Congestion Classification Based on the Multi-Head Self-Attention Vision Transformer Model
topic	road traffic classification; macroscopic approach; vision transformers; multi-head self-attention; deep learning
url	https://doi.org/10.2478/ttj-2024-0003

Similar Items