Controllable attention for structured layered video decomposition

The objective of this paper is to separate a video into its natural layers and to control which of the separated layers to attend to, for example to separate reflections, transparency or object motion. We make the following three contributions: (i) we introduce a new structur...

Bibliographic Details
Main Authors:	Alayrac, J-B, Carreira, J, Arandjelovic, R, Zisserman, A
Format:	Conference item
Language:	English
Published:	IEEE 2020

Similar Items

The visual centrifuge: Model-free layered video representations
by: Alayrac, J-B, et al.
Published: (2020)

Visual grounding in video for unsupervised word translation
by: Sigurdsson, GA, et al.
Published: (2020)

Video action transformer network
by: Girdhar, R, et al.
Published: (2020)

Massively parallel video networks
by: Carreira, J, et al.
Published: (2018)

End-to-end learning of visual representations from uncurated instructional videos
by: Miech, A, et al.
Published: (2020)

Input-level inductive biases for 3D reconstruction
by: Yifan, W, et al.
Published: (2022)

On-the-fly learning for visual search of large-scale image and video datasets
by: Chatfield, K, et al.
Published: (2015)

Learning layered motion segmentations of video
by: Kumar, MP, et al.
Published: (2005)

Learning layered motion segmentations of video
by: Kumar, M, et al.
Published: (2008)

Learning layered motion segmentations of video
by: Pawan Kumar, M, et al.
Published: (2007)

Layered neural rendering for retiming people in video
by: Lu, E, et al.
Published: (2020)

Finding visual attention regions in videos
by: Ang, Kenny Wen Bin
Published: (2010)

Advanced video coding based on matrix decomposition
by: Gu, Zhouye
Published: (2014)

Visiting the Invisible: layer-by-layer completed scene decomposition
by: Zheng, Chuanxia, et al.
Published: (2023)

Video Google: efficient visual search of videos
by: Sivic, J, et al.
Published: (2007)

Layer decomposition of design model for manufacture
by: Karthikeyan Duraisamy
Published: (2008)

Objects that sound
by: Arandjelović, R, et al.
Published: (2018)

DisLocation: scalable descriptor distinctiveness for location recognition
by: Arandjelović, R, et al.
Published: (2015)

Three things everyone should know to improve object retrieval
by: Arandjelović, R, et al.
Published: (2012)

Extremely low bit-rate nearest neighbor search using a set compression tree
by: Arandjelović, R, et al.
Published: (2014)

Visual vocabulary with a semantic twist
by: Arandjelović, R, et al.
Published: (2015)

Name that sculpture
by: Arandjelović, R, et al.
Published: (2012)

Multiple queries for large scale specific object retrieval
by: Arandjelovic, R, et al.
Published: (2012)

Smooth object retrieval using a bag of boundaries
by: Arandjelović, R, et al.
Published: (2012)

Look, listen and learn
by: Arandjelovic, R, et al.
Published: (2017)

All about VLAD
by: Arandjelović, R, et al.
Published: (2013)

Detection of visual attention regions in images and videos
by: Hu, Yiqun
Published: (2009)

Empirical evaluation of decomposition strategy for wavelet video compression
by: Fakeh, Rohmad, et al.
Published: (2009)

Quo Vadis, action recognition? A new model and the kinetics dataset
by: Carreira, J, et al.
Published: (2017)

Scalable design of structured controllers using chordal decomposition
by: Zheng, Y, et al.
Published: (2017)

Real-time decoding and display of layered structured video
by: Chang, Tzu-Yun Teresa
Published: (2007)

The AXES PRO video search system
by: McGuinness, K, et al.
Published: (2013)

Video Google: a text retrieval approach to object matching in videos
by: Sivic, J, et al.
Published: (2003)

Automatic face recognition for film character retrieval in feature-length films
by: Arandjelović, O, et al.
Published: (2005)

Learning attentional policies for tracking and recognition in video with deep networks
by: Bazzani, L, et al.
Published: (2011)

Learning from one continuous video stream
by: Carreira, J, et al.
Published: (2024)

Keeping your eye on the ball: Trajectory attention in video transformers
by: Patrick, M, et al.
Published: (2021)

Efficient visual search for objects in videos
by: Sivic, J, et al.
Published: (2008)

Faces in places: compound query retrieval
by: Zhong, Y, et al.
Published: (2016)

Compact deep aggregation for set retrieval
by: Zhong, Y, et al.
Published: (2019)