Violence Detection Enhancement by Involving Convolutional Block Attention Modules Into Various Deep Learning Architectures: Comprehensive Case Study for UBI-Fights Dataset

Bibliographic Details
Main Authors: Mahmoud Abdelkader Bashery Abbass, Hyun-Soo Kang
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10102455/
Description
Summary: Violence detection in surveillance videos is a difficult task, because spatio-temporal features must be extracted across varied video environments and camera perspectives. This paper proposes several architectures that perform this task with high performance, using the UBI-Fights dataset as a comprehensive case study. The proposed architectures combine Convolutional Block Attention Modules (CBAM) with other simple layers (e.g., ConvLSTM2D or Conv2D&LSTM). In addition, Categorical Focal Loss (CFL) is used as the loss function when training each architecture, to increase the focus on the most important features. The architectures are evaluated mainly with the Area Under the Curve (AUC) and the Equal Error Rate (EER), which indicate how well an architecture identifies violence correctly, with little confusion between classes. The results show that the proposed architectures outperform state-of-the-art techniques. For example, the Conv2D&LSTM-based architecture achieves an AUC of 0.9493 and an EER of 0.0507, outperforming most of the other proposed architectures as well as the state of the art.
ISSN: 2169-3536
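
The summary reports results as AUC and EER. The sketch below shows how these two metrics are commonly derived from binary labels and classifier scores, in plain Python. It is only an illustration: the function names (`roc_points`, `auc`, `eer`) and the toy data are hypothetical and do not come from the paper, and the authors' evaluation protocol may differ in detail.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) lists obtained by sweeping a threshold over the scores.

    labels: list of 0/1 ground-truth values (1 = violence, an assumption here).
    scores: list of classifier scores, higher meaning "more likely violence".
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    # Visit samples from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a ROC point only after a full group of tied scores.
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
        for k in range(1, len(fpr))
    )


def eer(fpr, tpr):
    """Equal Error Rate: the error level where FPR equals FNR (= 1 - TPR).

    Linear interpolation is used between the two ROC points that bracket
    the crossing.
    """
    for k in range(1, len(fpr)):
        d0 = fpr[k - 1] - (1.0 - tpr[k - 1])
        d1 = fpr[k] - (1.0 - tpr[k])
        if d0 <= 0.0 <= d1:
            if d1 == d0:
                return fpr[k]
            t = -d0 / (d1 - d0)
            return fpr[k - 1] + t * (fpr[k] - fpr[k - 1])
    raise RuntimeError("no FPR/FNR crossing found")


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy_labels = [1, 1, 0, 1, 0, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
    f, t = roc_points(toy_labels, toy_scores)
    print("AUC:", auc(f, t))
    print("EER:", eer(f, t))
```

A higher AUC and a lower EER both indicate better separation between the two classes, which is the sense in which the summary compares the architectures.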