Semantic-filtered Soft-Split-Aware video captioning with audio-augmented feature

Automatic video description, or video captioning, is a challenging yet highly attractive task. It aims to combine video with text. Multiple methods have been proposed based on neural networks, utilizing Convolutional Neural Networks (CNN) to extract features and Recurrent Neural Networks (RNN) to enc...

Bibliographic Details
Main Authors:	Xu, Yuecong; Yang, Jianfei; Mao, Kezhi
Other Authors:	School of Electrical and Electronic Engineering
Format:	Journal Article
Language:	English
Published:	2021
Subjects:	Engineering::Electrical and electronic engineering; Video Captioning; Long Short-term Memory
Online Access:	https://hdl.handle.net/10356/151341

Similar Items

Step by Step: A Gradual Approach for Dense Video Captioning
by: Wangyu Choi, et al.
Published: (2023-01-01)

Cross-modal graph with meta concepts for video captioning
by: Wang, Hao, et al.
Published: (2022)

Video Captioning Based on Channel Soft Attention and Semantic Reconstructor
by: Zhou Lei, et al.
Published: (2021-02-01)

Towards Human-Interactive Controllable Video Captioning with Efficient Modeling
by: Yoonseok Heo, et al.
Published: (2024-06-01)

Bilingual video captioning model for enhanced video retrieval
by: Norah Alrebdi, et al.
Published: (2024-01-01)

Parallel Dense Video Caption Generation with Multi-Modal Features
by: Xuefei Huang, et al.
Published: (2023-08-01)

Adaptive Curriculum Learning for Video Captioning
by: Shanhao Li, et al.
Published: (2022-01-01)

Automatic Image and Video Caption Generation With Deep Learning: A Concise Review and Algorithmic Overlap
by: Soheyla Amirian, et al.
Published: (2020-01-01)

Comparing the effectiveness of explicit EAL feedback through slideshow (text+audio) and captioned video
by: Jonathan Harrison
Published: (2022-04-01)

Fusion of Multi-Modal Features to Enhance Dense Video Caption
by: Xuefei Huang, et al.
Published: (2023-06-01)

Video captioning with stacked attention and semantic hard pull
by: Md. Mushfiqur Rahman, et al.
Published: (2021-08-01)

UAT: Universal Attention Transformer for Video Captioning
by: Heeju Im, et al.
Published: (2022-06-01)

Parallel Pathway Dense Video Captioning With Deformable Transformer
by: Wangyu Choi, et al.
Published: (2022-01-01)

MIRA-CAP: Memory-Integrated Retrieval-Augmented Captioning for State-of-the-Art Image and Video Captioning
by: Sabina Umirzakova, et al.
Published: (2024-12-01)

CapERA: Captioning Events in Aerial Videos
by: Laila Bashmal, et al.
Published: (2023-04-01)

Real-time Arabic Video Captioning Using CNN and Transformer Networks Based on Parallel Implementation
by: Adel Jalal Yousif, et al.
Published: (2024-03-01)

Video Captioning With Adaptive Attention and Mixed Loss Optimization
by: Huanhou Xiao, et al.
Published: (2019-01-01)

Action knowledge for video captioning with graph neural networks
by: Willy Fitra Hendria, et al.
Published: (2023-04-01)

Video Captions for Online Courses: Do YouTube’s Auto-generated Captions Meet Deaf Students’ Needs?
by: Becky Sue Parton
Published: (2016-08-01)

Deconfounded image captioning: a causal retrospect
by: Yang, Xu, et al.
Published: (2022)

A Semantics-Assisted Video Captioning Model Trained With Scheduled Sampling
by: Haoran Chen, et al.
Published: (2020-09-01)

Deep learning and knowledge graph for image/video captioning: A review of datasets, evaluation metrics, and methods
by: Mohammad Saif Wajid, et al.
Published: (2024-01-01)

Fine-Grained Length Controllable Video Captioning With Ordinal Embeddings
by: Tomoya Nitta, et al.
Published: (2024-01-01)

PWS-DVC: Enhancing Weakly Supervised Dense Video Captioning With Pretraining Approach
by: Wangyu Choi, et al.
Published: (2023-01-01)

Teaching Medical English through Professional Captioning Videos
by: Džuganová Božena
Published: (2019-09-01)

Caption-Guided Interpretable Video Anomaly Detection Based on Memory Similarity
by: Yuzhi Shi, et al.
Published: (2024-01-01)

Explicit Image Caption Reasoning: Generating Accurate and Informative Captions for Complex Scenes with LMM
by: Mingzhang Cui, et al.
Published: (2024-06-01)

Quality Enhancement Based Video Captioning in Video Communication Systems
by: The Van Le, et al.
Published: (2024-01-01)

Evaluation metrics for video captioning: A survey
by: Andrei de Souza Inácio, et al.
Published: (2023-09-01)

Novel Object Captioning with Semantic Match from External Knowledge
by: Sen Du, et al.
Published: (2023-07-01)

Exploring deep learning approaches for video captioning: A comprehensive review
by: Adel Jalal Yousif, et al.
Published: (2023-12-01)

Context-aware visual policy network for fine-grained image captioning
by: Zha, Zheng-Jun, et al.
Published: (2022)

A Fine-Grained Spatial-Temporal Attention Model for Video Captioning
by: An-An Liu, et al.
Published: (2018-01-01)

Stack-VS : stacked visual-semantic attention for image caption generation
by: Cheng, Ling, et al.
Published: (2021)

Text Augmentation Using BERT for Image Captioning
by: Viktar Atliha, et al.
Published: (2020-08-01)

Video captioning based on vision transformer and reinforcement learning
by: Hong Zhao, et al.
Published: (2022-03-01)

Understanding Objects in Video: Object-Oriented Video Captioning via Structured Trajectory and Adversarial Learning
by: Fangyi Zhu, et al.
Published: (2020-01-01)

Learning to collocate Visual-Linguistic Neural Modules for image captioning
by: Yang, Xu, et al.
Published: (2023)

Vision-Text Cross-Modal Fusion for Accurate Video Captioning
by: Kaouther Ouenniche, et al.
Published: (2023-01-01)