Group Invariant Deep Representations for Image Instance Retrieval

Most image instance retrieval pipelines are based on comparison of vectors known as global image descriptors between a query image and the database images. Due to their success in large scale image classification, representations extracted from Convolutional Neural Networks (CNN) are quickly gaining...


Bibliographic Details
Main Authors: Morère, Olivier, Veillard, Antoine, Lin, Jie, Petta, Julie, Chandrasekhar, Vijay, Poggio, Tomaso
Format: Technical Report
Language: en_US
Published: Center for Brains, Minds and Machines (CBMM) 2016
Subjects: Global Image Descriptors; Convolutional Neural Networks (CNN)
Online Access: http://hdl.handle.net/1721.1/100796
collection MIT
description Most image instance retrieval pipelines are based on comparison of vectors known as global image descriptors between a query image and the database images. Due to their success in large-scale image classification, representations extracted from Convolutional Neural Networks (CNN) are quickly gaining ground on Fisher Vectors (FVs) as state-of-the-art global descriptors for image instance retrieval. While CNN-based descriptors are generally noted for good retrieval performance at lower bitrates, they nevertheless present a number of drawbacks, including a lack of robustness to common object transformations such as rotations compared with their interest-point-based FV counterparts. In this paper, we propose a method for computing invariant global descriptors from CNNs. Our method implements a recently proposed mathematical theory for invariance in a sensory cortex modeled as a feedforward neural network. The resulting global descriptors can be made invariant to multiple arbitrary transformation groups while retaining good discriminativeness. Based on a thorough empirical evaluation using several publicly available datasets, we show that our method is able to significantly and consistently improve retrieval results every time a new type of invariance is incorporated. We also show that our method, which has few parameters, is not prone to overfitting: improvements generalize well across datasets with different properties with regard to invariances. Finally, we show that our descriptors are able to compare favourably to other state-of-the-art compact descriptors in similar bitranges, exceeding the highest retrieval results reported in the literature on some datasets. A dedicated dimensionality reduction step (quantization or hashing) may be able to further improve the competitiveness of the descriptors.
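The two ideas in the description, pooling over a transformation group and ranking database images by descriptor similarity, can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_features` is a hypothetical stand-in for a CNN (a fixed random projection with a ReLU), the group is assumed to be the four 90-degree rotations, and the mean is just one possible pooling statistic. The names `SIDE`, `W`, `invariant_descriptor`, and `rank_database` are all illustrative assumptions.

```python
import numpy as np

RNG = np.random.default_rng(0)
SIDE = 8                                     # toy images are SIDE x SIDE (assumption)
W = RNG.standard_normal((16, SIDE * SIDE))   # fixed random projection, not a real CNN


def toy_features(image):
    """Hypothetical stand-in for CNN features: random projection plus ReLU."""
    return np.maximum(W @ image.ravel(), 0.0)


def invariant_descriptor(image):
    """Average the features over the orbit of the image under 90-degree rotations.

    Rotating the input only permutes the members of the orbit, so the pooled
    vector is unchanged up to floating-point rounding.
    """
    orbit = [np.rot90(image, k) for k in range(4)]
    pooled = np.mean([toy_features(g) for g in orbit], axis=0)
    return pooled / (np.linalg.norm(pooled) + 1e-12)  # unit length for cosine similarity


def rank_database(query, database):
    """Return database indices sorted by decreasing cosine similarity to the query."""
    q = invariant_descriptor(query)
    scores = np.array([invariant_descriptor(img) @ q for img in database])
    return np.argsort(-scores)


# Toy database of random images; the query is a rotated copy of image 3.
database = [RNG.random((SIDE, SIDE)) for _ in range(5)]
query = np.rot90(database[3])
ranking = rank_database(query, database)
print(ranking)
```

Because the query is a rotated copy of database image 3, its pooled descriptor matches that image's descriptor and index 3 ranks first. Without the pooling step, `toy_features` alone would generally give different vectors for the two images.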
id mit-1721.1/100796
institution Massachusetts Institute of Technology
spelling mit-1721.1/100796
2019-04-11T13:29:43Z
This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.
2016-01-12T01:09:23Z
2016-01-12T01:09:23Z
2016-01-11
Technical Report
Working Paper
http://hdl.handle.net/1721.1/100796
en_US
CBMM Memo Series;043
Attribution-NonCommercial 3.0 United States
http://creativecommons.org/licenses/by-nc/3.0/us/
application/pdf
Center for Brains, Minds and Machines (CBMM)
topic Global Image Descriptors
Convolutional Neural Networks (CNN)