Condensed movies: story based retrieval with contextual embeddings
Our objective in this work is long-range understanding of the narrative structure of movies. Instead of considering the entire movie, we propose to learn from the ‘key scenes’ of the movie, providing a condensed look at the full storyline. To this end, we make the following three contributions: (i)...
Main Authors: | Bain, M; Nagrani, A; Brown, A; Zisserman, A |
---|---|
Format: | Conference item |
Language: | English |
Published: | Springer 2021 |
_version_ | 1797077355373527040 |
---|---|
author | Bain, M Nagrani, A Brown, A Zisserman, A |
collection | OXFORD |
description | Our objective in this work is long-range understanding of the narrative structure of movies. Instead of considering the entire movie, we propose to learn from the ‘key scenes’ of the movie, providing a condensed look at the full storyline. To this end, we make the following three contributions: (i) We create the Condensed Movies Dataset (CMD) consisting of the key scenes from over 3K movies: each key scene is accompanied by a high-level semantic description of the scene, character face-tracks, and metadata about the movie. The dataset is scalable, obtained automatically from YouTube, and is freely available for anybody to download and use. It is also an order of magnitude larger than existing movie datasets in the number of movies; (ii) We provide a deep network baseline for text-to-video retrieval on our dataset, combining character, speech and visual cues into a single video embedding; and finally (iii) We demonstrate how the addition of context from other video clips improves retrieval performance. |
first_indexed | 2024-03-07T00:16:45Z |
format | Conference item |
id | oxford-uuid:7b1aedd3-bd28-4272-ad30-588f1cb49f8f |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-07T00:16:45Z |
publishDate | 2021 |
publisher | Springer |
record_format | dspace |
title | Condensed movies: story based retrieval with contextual embeddings |