Traffic Accident Detection Using Background Subtraction and CNN Encoder–Transformer Decoder in Video Frames

Bibliographic Details
Main Authors: Yihang Zhang, Yunsick Sung
Format: Article
Language: English
Published: MDPI AG, 2023-06-01
Series: Mathematics
Subjects: artificial intelligence; deep learning; traffic-accident detection; background subtraction; CNN encoder; Transformer decoder
Online Access: https://www.mdpi.com/2227-7390/11/13/2884
author Yihang Zhang
Yunsick Sung
collection DOAJ
description Artificial intelligence plays a significant role in traffic-accident detection. Traffic accidents involve a cascade of inadvertent events, making traditional detection approaches challenging. For instance, Convolutional Neural Network (CNN)-based approaches cannot analyze temporal relationships among objects, and Recurrent Neural Network (RNN)-based approaches suffer from low processing speeds and cannot detect traffic accidents across multiple frames simultaneously. Furthermore, these networks overlook background interference in input video frames. This paper proposes a framework that begins by subtracting the background using You Only Look Once version 5 (YOLOv5), which adaptively reduces background interference when detecting objects. Subsequently, a CNN encoder and a Transformer decoder are combined into an end-to-end model that extracts spatial and temporal features across different time points, allowing for parallel analysis of input video frames. The proposed framework was evaluated on the Car Crash Dataset through a series of comparison and ablation experiments. It was benchmarked against three accident-detection models to evaluate its effectiveness and achieved a superior accuracy of approximately 96%. The ablation experiments indicate that omitting background subtraction from the framework decreased all evaluation indicators by approximately 3%.
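The pipeline described above (detector-guided background subtraction, then a per-frame CNN encoder whose features feed a Transformer decoder for clip-level classification) can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' exact architecture: the bounding boxes are assumed to come from a YOLOv5 detector (stubbed here as plain coordinate tuples), and the layer sizes, pooling, and single learned query are assumptions.

```python
import torch
import torch.nn as nn

def subtract_background(frames, boxes):
    """Zero out pixels outside detected object boxes.

    frames: (T, 3, H, W) clip tensor; boxes: per-frame list of
    (x1, y1, x2, y2) tuples, assumed to come from a YOLOv5 detector.
    """
    masked = torch.zeros_like(frames)
    for t, frame_boxes in enumerate(boxes):
        for (x1, y1, x2, y2) in frame_boxes:
            # Keep only the foreground (detected-object) regions.
            masked[t, :, y1:y2, x1:x2] = frames[t, :, y1:y2, x1:x2]
    return masked

class AccidentDetector(nn.Module):
    """Per-frame CNN encoder + Transformer decoder over the frame sequence."""

    def __init__(self, d_model=128, nhead=4, num_layers=2):
        super().__init__()
        # CNN encoder: maps each frame to a d_model-dim feature vector.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, d_model),
        )
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        # Decoder attends over all frame features in parallel,
        # capturing temporal relationships across time points.
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.query = nn.Parameter(torch.randn(1, 1, d_model))
        self.head = nn.Linear(d_model, 2)  # accident / no accident

    def forward(self, frames):                   # frames: (T, 3, H, W)
        feats = self.cnn(frames).unsqueeze(0)    # (1, T, d_model)
        out = self.decoder(self.query, feats)    # (1, 1, d_model)
        return self.head(out.squeeze(1))         # (1, 2) logits

frames = torch.rand(8, 3, 64, 64)               # an 8-frame clip
boxes = [[(10, 10, 50, 50)] for _ in range(8)]  # stand-in YOLOv5 boxes
logits = AccidentDetector()(subtract_background(frames, boxes))
```

Because the decoder attends over the whole feature sequence at once rather than stepping through it, all frames of a clip are analyzed in parallel, which is the speed advantage the abstract claims over RNN-based approaches.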
format Article
id doaj.art-95db5638d20145da90673ae681e268b6
institution Directory Open Access Journal
issn 2227-7390
language English
publishDate 2023-06-01
publisher MDPI AG
series Mathematics
doi 10.3390/math11132884
volume 11
issue 13
article_number 2884
affiliation Yihang Zhang: Department of Autonomous Things Intelligence, Dongguk University-Seoul, Seoul 04620, Republic of Korea
affiliation Yunsick Sung: Division of AI Software Convergence, Dongguk University-Seoul, Seoul 04620, Republic of Korea
title Traffic Accident Detection Using Background Subtraction and CNN Encoder–Transformer Decoder in Video Frames
topic artificial intelligence
deep learning
traffic-accident detection
background subtraction
CNN encoder
Transformer decoder
url https://www.mdpi.com/2227-7390/11/13/2884