Illation of Video Visual Relation Detection Based on Graph Neural Network
The visual relation detection task bridges semantic text and image information: it expresses the content of an image or video through relation triples of the form <subject, predicate, object>. This research can be applied to image question answering, vid...
Main Authors: | Mingcheng Qu, Jianxun Cui, Yuxi Nie, Tonghua Su |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access |
Subjects: | Video visual relation detection; target detection; graph convolutional neural network |
Online Access: | https://ieeexplore.ieee.org/document/9547267/ |
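The abstract describes a three-step pipeline: motion trajectories are generated for the subject and object, a VRGE module based on a graph convolutional network predicts the relationship between each object pair, and a multi-hypothesis fusion (MHF) step assembles the final relation triplets. The Python sketch below illustrates that flow under stated assumptions; all class and function names (`Trajectory`, `vrge_predict`, `multi_hypothesis_fusion`, `detect_video_relations`) are hypothetical and are not the authors' published code.

```python
# Hypothetical sketch of the three-step pipeline described in the abstract.
# All names here are illustrative assumptions, not the authors' published code.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Trajectory:
    """Bounding-box track of one detected object across a video clip."""
    label: str                               # e.g. "dog", "person"
    boxes: List[Tuple[int, int, int, int]]   # one (x1, y1, x2, y2) box per frame


@dataclass
class RelationTriplet:
    subject: str
    predicate: str
    object: str
    score: float


def generate_trajectories(frames) -> List[Trajectory]:
    """Step 1 (assumed): detect objects per frame and link them into tracks."""
    raise NotImplementedError("plug in an object detector + tracker here")


def vrge_predict(subj: Trajectory, obj: Trajectory) -> List[Tuple[str, float]]:
    """Step 2 (assumed): a graph-convolution module (VRGE) scores candidate
    predicates for one (subject, object) trajectory pair within a clip."""
    raise NotImplementedError("plug in the graph convolutional relation module here")


def multi_hypothesis_fusion(per_clip: List[List[RelationTriplet]]) -> List[RelationTriplet]:
    """Step 3 (assumed): merge hypotheses for the same triplet across clips,
    keeping the highest-scoring instance of each <subject, predicate, object>."""
    best = {}
    for clip_triplets in per_clip:
        for t in clip_triplets:
            key = (t.subject, t.predicate, t.object)
            if key not in best or t.score > best[key].score:
                best[key] = t
    return sorted(best.values(), key=lambda t: t.score, reverse=True)


def detect_video_relations(clips) -> List[RelationTriplet]:
    """End-to-end sketch: trajectories -> pairwise VRGE predicates -> MHF fusion."""
    per_clip = []
    for frames in clips:
        tracks = generate_trajectories(frames)
        hypotheses = []
        for subj in tracks:
            for obj in tracks:
                if subj is obj:
                    continue
                for predicate, score in vrge_predict(subj, obj):
                    hypotheses.append(RelationTriplet(subj.label, predicate, obj.label, score))
        per_clip.append(hypotheses)
    return multi_hypothesis_fusion(per_clip)


if __name__ == "__main__":
    # The fusion step alone can be demonstrated with hand-made hypotheses.
    clip_a = [RelationTriplet("dog", "chase", "person", 0.71)]
    clip_b = [RelationTriplet("dog", "chase", "person", 0.85),
              RelationTriplet("person", "walk_with", "dog", 0.60)]
    for triplet in multi_hypothesis_fusion([clip_a, clip_b]):
        print(triplet)
```

The greedy merge above keeps one score per distinct triplet; the paper's actual MHF operates on a multi-hypothesis tree and is more involved than this simplified sketch.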
_version_ | 1819145435367866368 |
---|---|
author | Mingcheng Qu Jianxun Cui Yuxi Nie Tonghua Su |
author_facet | Mingcheng Qu Jianxun Cui Yuxi Nie Tonghua Su |
author_sort | Mingcheng Qu |
collection | DOAJ |
description | The visual relation detection task bridges semantic text and image information: it expresses the content of an image or video through relation triples of the form <subject, predicate, object>, and it supports applications such as image question answering and video subtitles. Using video as the input to visual relationship detection has received comparatively little attention. We therefore propose an algorithm based on a graph convolutional neural network and a multi-hypothesis tree to perform video relationship prediction. The algorithm proceeds in three steps: first, the motion trajectories of the subject and object in the input video clip are generated; second, a VRGE network module based on the graph convolutional neural network predicts the relationships between objects in the clip; finally, the relationship triplets are formed by combining the predicted visual relationships through the multi-hypothesis fusion (MHF) algorithm. We verify our method on the benchmark ImageNet-VidVRD dataset. The experimental results demonstrate that the proposed method achieves a satisfactory accuracy of 29.05% and recall of 10.18% for visual relation detection. |
first_indexed | 2024-12-22T12:57:59Z |
format | Article |
id | doaj.art-dd585be16d894eec836b8e7a7b9c2668 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-22T12:57:59Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-dd585be16d894eec836b8e7a7b9c2668 | 2022-12-21T18:25:04Z | eng | IEEE | IEEE Access | 2169-3536 | 2021-01-01 | Vol. 9, pp. 141144-141153 | 10.1109/ACCESS.2021.3115260 | Article 9547267 | Illation of Video Visual Relation Detection Based on Graph Neural Network | Mingcheng Qu; Jianxun Cui; Yuxi Nie (https://orcid.org/0000-0001-6468-6898); Tonghua Su | Department of Software, Harbin Institute of Technology, Harbin, China | https://ieeexplore.ieee.org/document/9547267/ | Video visual relation detection; target detection; graph convolutional neural network |
spellingShingle | Mingcheng Qu Jianxun Cui Yuxi Nie Tonghua Su Illation of Video Visual Relation Detection Based on Graph Neural Network IEEE Access Video visual relation detection target detection graph convolutional neural network |
title | Illation of Video Visual Relation Detection Based on Graph Neural Network |
title_full | Illation of Video Visual Relation Detection Based on Graph Neural Network |
title_fullStr | Illation of Video Visual Relation Detection Based on Graph Neural Network |
title_full_unstemmed | Illation of Video Visual Relation Detection Based on Graph Neural Network |
title_short | Illation of Video Visual Relation Detection Based on Graph Neural Network |
title_sort | illation of video visual relation detection based on graph neural network |
topic | Video visual relation detection target detection graph convolutional neural network |
url | https://ieeexplore.ieee.org/document/9547267/ |
work_keys_str_mv | AT mingchengqu illationofvideovisualrelationdetectionbasedongraphneuralnetwork AT jianxuncui illationofvideovisualrelationdetectionbasedongraphneuralnetwork AT yuxinie illationofvideovisualrelationdetectionbasedongraphneuralnetwork AT tonghuasu illationofvideovisualrelationdetectionbasedongraphneuralnetwork |