Multiple queries for large scale specific object retrieval
The aim of large scale specific-object image retrieval systems is to instantaneously find images that contain the query object in the image database. Current systems, for example Google Goggles, concentrate on querying using a single view of an object, e.g. a photo a user takes with his mobile phone...
Main Authors: | Arandjelovic, R, Zisserman, A |
---|---|
Format: | Conference item |
Language: | English |
Published: | British Machine Vision Association 2012 |
_version_ | 1824458725993218048 |
---|---|
author | Arandjelovic, R Zisserman, A |
author_facet | Arandjelovic, R Zisserman, A |
author_sort | Arandjelovic, R |
collection | OXFORD |
description | The aim of large scale specific-object image retrieval systems is to instantaneously
find images that contain the query object in the image database. Current systems, for
example Google Goggles, concentrate on querying using a single view of an object, e.g. a
photo a user takes with his mobile phone, in order to answer the question “what is this?”.
Here we consider the somewhat converse problem of finding all images of an object given
that the user knows what he is looking for; so the input modality is text, not an image.
This problem is useful in a number of settings, for example media production teams are
interested in searching internal databases for images or video footage to accompany news
reports and newspaper articles.
Given a textual query (e.g. “coca cola bottle”), our approach is to first obtain multiple
images of the queried object using textual Google image search. These images are then
used to visually query the target database to discover images containing the object of
interest. We compare a number of different methods for combining the multiple query
images, including discriminative learning. We show that issuing multiple queries significantly improves recall and enables the system to find quite challenging occurrences of
the queried object.
The system is evaluated quantitatively on the standard Oxford Buildings benchmark
dataset where it achieves very high retrieval performance, and also qualitatively on the
TrecVid 2011 known-item search dataset. |
first_indexed | 2025-02-19T04:30:28Z |
format | Conference item |
id | oxford-uuid:b88d390e-0bb1-4860-a09d-1aac90d3065f |
institution | University of Oxford |
language | English |
last_indexed | 2025-02-19T04:30:28Z |
publishDate | 2012 |
publisher | British Machine Vision Association |
record_format | dspace |
spelling | oxford-uuid:b88d390e-0bb1-4860-a09d-1aac90d3065f 2024-12-18T14:09:03Z Multiple queries for large scale specific object retrieval Conference item http://purl.org/coar/resource_type/c_5794 uuid:b88d390e-0bb1-4860-a09d-1aac90d3065f English Symplectic Elements British Machine Vision Association 2012 Arandjelovic, R Zisserman, A |
spellingShingle | Arandjelovic, R Zisserman, A Multiple queries for large scale specific object retrieval |
title | Multiple queries for large scale specific object retrieval |
title_full | Multiple queries for large scale specific object retrieval |
title_fullStr | Multiple queries for large scale specific object retrieval |
title_full_unstemmed | Multiple queries for large scale specific object retrieval |
title_short | Multiple queries for large scale specific object retrieval |
title_sort | multiple queries for large scale specific object retrieval |
work_keys_str_mv | AT arandjelovicr multiplequeriesforlargescalespecificobjectretrieval AT zissermana multiplequeriesforlargescalespecificobjectretrieval |
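
The description says that several query images are issued against the target database and their results are then combined, but this record does not name the combination methods that were compared. As a rough illustration only, the sketch below merges per-query similarity scores using two generic rules, taking the maximum or the average per database image. The function name `combine_scores`, the score dictionaries, and the image ids are all hypothetical, and neither rule should be read as the authors' method.

```python
from collections import defaultdict


def combine_scores(per_query_scores, method="max"):
    """Merge retrieval scores from several query images into one ranking.

    per_query_scores: list of dicts, one per query image, each mapping a
    database image id to a similarity score (higher means more similar).
    Both combination rules here are illustrative assumptions.
    """
    pooled = defaultdict(list)
    for scores in per_query_scores:
        for image_id, score in scores.items():
            pooled[image_id].append(score)

    num_queries = len(per_query_scores)
    if method == "max":
        # An image ranks highly if any single query matches it well.
        combined = {i: max(s) for i, s in pooled.items()}
    elif method == "avg":
        # Images a query did not return count as score 0 for that query.
        combined = {i: sum(s) / num_queries for i, s in pooled.items()}
    else:
        raise ValueError(f"unknown method: {method}")

    # Sort by descending score; break ties by image id for determinism.
    return sorted(combined, key=lambda i: (-combined[i], i))


# Hypothetical scores returned for two query images of the same object.
q1 = {"img_a": 0.9, "img_b": 0.2}
q2 = {"img_b": 0.8, "img_c": 0.7}

ranking_max = combine_scores([q1, q2], method="max")
ranking_avg = combine_scores([q1, q2], method="avg")
```

With these made-up scores, the two rules order the images differently: the maximum rule favours `img_a`, which one query matches strongly, while the average rule favours `img_b`, which both queries return.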