Multiple queries for large scale specific object retrieval


Bibliographic Details
Main Authors: Arandjelovic, R, Zisserman, A
Format: Conference item
Language: English
Published: British Machine Vision Association 2012
Description: The aim of large scale specific-object image retrieval systems is to instantaneously find images that contain the query object in the image database. Current systems, for example Google Goggles, concentrate on querying using a single view of an object, e.g. a photo a user takes with his mobile phone, in order to answer the question “what is this?”. Here we consider the somewhat converse problem of finding all images of an object given that the user knows what he is looking for, so the input modality is text, not an image. This problem is useful in a number of settings; for example, media production teams are interested in searching internal databases for images or video footage to accompany news reports and newspaper articles. Given a textual query (e.g. “coca cola bottle”), our approach is to first obtain multiple images of the queried object using textual Google image search. These images are then used to visually query the target database to discover images containing the object of interest. We compare a number of different methods for combining the multiple query images, including discriminative learning. We show that issuing multiple queries significantly improves recall and enables the system to find quite challenging occurrences of the queried object. The system is evaluated quantitatively on the standard Oxford Buildings benchmark dataset, where it achieves very high retrieval performance, and also qualitatively on the TrecVid 2011 known-item search dataset.
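The description mentions combining multiple query images but does not say which combination rules were compared. As a purely illustrative sketch, and not the authors' method, the following Python shows one common way such a combination could work: each query image produces a similarity score per database image, and the scores are fused by taking either the maximum or the average before ranking. The function names (`combine_scores`, `rank`), the image identifiers, and the scores are all hypothetical.

```python
from typing import Dict, List


def combine_scores(per_query_scores: List[Dict[str, float]],
                   method: str = "max") -> Dict[str, float]:
    """Fuse per-query similarity scores into one score per database image.

    per_query_scores holds one dict per query image, mapping a database
    image id to its similarity score (hypothetical values, higher is better).
    """
    collected: Dict[str, List[float]] = {}
    for scores in per_query_scores:
        for image_id, score in scores.items():
            collected.setdefault(image_id, []).append(score)

    num_queries = len(per_query_scores)
    if method == "max":
        # An image ranks highly if any single query image matches it well.
        return {image_id: max(s) for image_id, s in collected.items()}
    if method == "avg":
        # Images missing from a query's results count as score 0 for it.
        return {image_id: sum(s) / num_queries
                for image_id, s in collected.items()}
    raise ValueError(f"unknown method: {method}")


def rank(combined: Dict[str, float]) -> List[str]:
    """Return image ids sorted by descending score, ties broken by id."""
    return sorted(combined, key=lambda image_id: (-combined[image_id], image_id))


# Made-up scores for three query images obtained from a text search.
example = [
    {"db_a": 0.9, "db_b": 0.2},
    {"db_b": 0.8, "db_c": 0.5},
    {"db_c": 0.6},
]

print(rank(combine_scores(example, "max")))
print(rank(combine_scores(example, "avg")))
```

In this toy data the two rules order the database images differently: the maximum rewards an image that one query matches strongly, while the average favours images matched by several queries. Which behaviour is preferable depends on how varied the query images are.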
Identifier: oxford-uuid:b88d390e-0bb1-4860-a09d-1aac90d3065f
Institution: University of Oxford