Enhancing Surveillance Systems: Integration of Object, Behavior, and Space Information in Captions for Advanced Risk Assessment

This paper presents a novel approach to risk assessment by incorporating image captioning as a fundamental component to enhance the effectiveness of surveillance systems. The proposed surveillance system utilizes image captioning to generate descriptive captions that portray the relationship between...

Bibliographic Details
Main Authors: Minseong Jeon, Jaepil Ko, Kyungjoo Cheoi
Format: Article
Language: English
Published: MDPI AG 2024-01-01
Series: Sensors
Subjects:
Online Access: https://www.mdpi.com/1424-8220/24/1/292
collection DOAJ
description This paper presents a novel approach to risk assessment by incorporating image captioning as a fundamental component to enhance the effectiveness of surveillance systems. The proposed surveillance system utilizes image captioning to generate descriptive captions that portray the relationship between objects, actions, and space elements within the observed scene. Subsequently, it evaluates the risk level based on the content of these captions. After defining the risk levels to be detected in the surveillance system, we constructed a dataset consisting of [Image-Caption-Danger Score]. Our dataset offers caption data presented in a unique sentence format, departing from conventional caption styles. This unique format enables a comprehensive interpretation of surveillance scenes by considering various elements, such as objects, actions, and spatial context. We fine-tuned the BLIP-2 model using our dataset to generate captions, and captions were then interpreted with BERT to evaluate the risk level of each scene, categorizing them into stages ranging from 1 to 7. Multiple experiments provided empirical support for the effectiveness of the proposed system, demonstrating significant accuracy rates of 92.3%, 89.8%, and 94.3% for three distinct risk levels: safety, hazard, and danger, respectively.
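The description above mentions an [Image-Caption-Danger Score] record, a danger stage from 1 to 7, and three risk levels (safety, hazard, danger). The following is a minimal Python sketch of how such a record and the stage-to-level grouping might be represented. The record does not state which stages belong to which level, so the cut-offs `hazard_from` and `danger_from`, the class and function names, and the sample caption are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurveillanceSample:
    """One [Image-Caption-Danger Score] record (field names are hypothetical)."""
    image_path: str
    caption: str
    danger_score: int  # stage from 1 to 7, as stated in the description


def risk_level(danger_score: int, hazard_from: int = 3, danger_from: int = 6) -> str:
    """Group a 1-7 danger stage into safety / hazard / danger.

    The cut-offs are illustrative assumptions; the source record does not
    say how the seven stages are divided among the three levels.
    """
    if not 1 <= danger_score <= 7:
        raise ValueError("danger_score must be between 1 and 7")
    if not 1 < hazard_from < danger_from <= 7:
        raise ValueError("cut-offs must satisfy 1 < hazard_from < danger_from <= 7")
    if danger_score >= danger_from:
        return "danger"
    if danger_score >= hazard_from:
        return "hazard"
    return "safety"


# Hypothetical sample; the caption text is invented for illustration only.
sample = SurveillanceSample(
    image_path="frame_0001.jpg",
    caption="a person is climbing over a fence next to a railway track",
    danger_score=6,
)
print(sample.caption, "->", risk_level(sample.danger_score))
```

Keeping the cut-offs as parameters rather than constants makes the assumption visible and easy to replace once the actual stage definitions from the article are known.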
first_indexed 2024-03-08T14:57:47Z
id doaj.art-d8f79dc70af24aa58132bd68048404b8
institution Directory Open Access Journal
issn 1424-8220
last_indexed 2024-03-08T14:57:47Z
spelling doaj.art-d8f79dc70af24aa58132bd68048404b8, 2024-01-10T15:09:25Z, eng, MDPI AG, Sensors, ISSN 1424-8220, 2024-01-01, vol. 24, no. 1, art. 292, doi:10.3390/s24010292
Minseong Jeon: Department of Computer Science, Chungbuk National University, 1 Chungdae-ro, Seowon-gu, Cheongju, Chungbuk 28644, Republic of Korea
Jaepil Ko: Department of Computer Engineering, Kumoh National Institute of Technology, 61 Daehak-ro, Gumi, Gyeongbuk 39177, Republic of Korea
Kyungjoo Cheoi: Department of Computer Science, Chungbuk National University, 1 Chungdae-ro, Seowon-gu, Cheongju, Chungbuk 28644, Republic of Korea
topic surveillance system
image captioning
descriptive captions
risk assessment
BLIP-2
BERT