Group Invariant Deep Representations for Image Instance Retrieval
Most image instance retrieval pipelines are based on comparison of vectors known as global image descriptors between a query image and the database images. Due to their success in large scale image classification, representations extracted from Convolutional Neural Networks (CNN) are quickly gaining...
Main Authors: | Morère, Olivier, Veillard, Antoine, Lin, Jie, Petta, Julie, Chandrasekhar, Vijay, Poggio, Tomaso |
---|---|
Format: | Technical Report |
Language: | en_US |
Published: | Center for Brains, Minds and Machines (CBMM), 2016 |
Subjects: | Global Image Descriptors; Convolutional Neural Networks (CNN) |
Online Access: | http://hdl.handle.net/1721.1/100796 |
_version_ | 1826198277229379584 |
---|---|
author | Morère, Olivier Veillard, Antoine Lin, Jie Petta, Julie Chandrasekhar, Vijay Poggio, Tomaso |
author_facet | Morère, Olivier Veillard, Antoine Lin, Jie Petta, Julie Chandrasekhar, Vijay Poggio, Tomaso |
author_sort | Morère, Olivier |
collection | MIT |
description | Most image instance retrieval pipelines are based on comparison of vectors known as global image descriptors between a query image and the database images. Due to their success in large scale image classification, representations extracted from Convolutional Neural Networks (CNN) are quickly gaining ground on Fisher Vectors (FVs) as state-of-the-art global descriptors for image instance retrieval. While CNN-based descriptors are generally noted for good retrieval performance at lower bitrates, they nevertheless present a number of drawbacks, including the lack of robustness to common object transformations such as rotations compared with their interest point based FV counterparts.
In this paper, we propose a method for computing invariant global descriptors from CNNs. Our method implements a recently proposed mathematical theory for invariance in a sensory cortex modeled as a feedforward neural network. The resulting global descriptors can be made invariant to multiple arbitrary transformation groups while retaining good discriminativeness.
Based on a thorough empirical evaluation using several publicly available datasets, we show that our method is able to significantly and consistently improve retrieval results every time a new type of invariance is incorporated. We also show that our method, which has few parameters, is not prone to overfitting: improvements generalize well across datasets with different properties with regard to invariances. Finally, we show that our descriptors are able to compare favourably to other state-of-the-art compact descriptors in similar bitrate ranges, exceeding the highest retrieval results reported in the literature on some datasets. A dedicated dimensionality reduction step (quantization or hashing) may be able to further improve the competitiveness of the descriptors. |
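The memo's exact construction is not reproduced in this record. As a rough, hypothetical sketch of the orbit-pooling idea the abstract describes (summarizing descriptors of transformed copies of an image over a finite transformation group by their moments), the following NumPy/SciPy snippet uses a fixed random projection as a stand-in for a real CNN feature extractor; all names and parameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.ndimage import rotate


def make_descriptor_fn(input_size, out_dim=256, seed=0):
    """Stand-in for a CNN feature extractor: a fixed random projection of
    the flattened image, L2-normalized. In practice this would be a pooled
    convolutional layer of a pretrained network."""
    proj = np.random.default_rng(seed).standard_normal((out_dim, input_size))

    def descriptor(image):
        v = proj @ image.ravel()
        return v / (np.linalg.norm(v) + 1e-12)

    return descriptor


def group_invariant_descriptor(image, descriptor, angles=(0, 90, 180, 270), n_moments=3):
    """Pool descriptors of transformed copies of the image over a finite
    rotation group: the orbit is summarized by the first few moments of each
    descriptor component, so the result no longer depends on which rotation
    the query image happened to undergo."""
    orbit = np.stack([descriptor(rotate(image, a, reshape=False)) for a in angles])
    pooled = np.concatenate([(orbit ** k).mean(axis=0) for k in range(1, n_moments + 1)])
    return pooled / (np.linalg.norm(pooled) + 1e-12)


# Usage: a rotated copy of an image maps to (nearly) the same invariant descriptor.
img = np.random.default_rng(1).random((64, 64))
desc = make_descriptor_fn(img.size)
d0 = group_invariant_descriptor(img, desc)
d1 = group_invariant_descriptor(rotate(img, 90, reshape=False), desc)
print(float(d0 @ d1))  # cosine similarity, close to 1.0
```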
first_indexed | 2024-09-23T11:02:21Z |
format | Technical Report |
id | mit-1721.1/100796 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T11:02:21Z |
publishDate | 2016 |
publisher | Center for Brains, Minds and Machines (CBMM) |
record_format | dspace |
spelling | mit-1721.1/100796 2019-04-11T13:29:43Z Group Invariant Deep Representations for Image Instance Retrieval Morère, Olivier Veillard, Antoine Lin, Jie Petta, Julie Chandrasekhar, Vijay Poggio, Tomaso Global Image Descriptors Convolutional Neural Networks (CNN) Most image instance retrieval pipelines are based on comparison of vectors known as global image descriptors between a query image and the database images. Due to their success in large scale image classification, representations extracted from Convolutional Neural Networks (CNN) are quickly gaining ground on Fisher Vectors (FVs) as state-of-the-art global descriptors for image instance retrieval. While CNN-based descriptors are generally noted for good retrieval performance at lower bitrates, they nevertheless present a number of drawbacks, including the lack of robustness to common object transformations such as rotations compared with their interest point based FV counterparts. In this paper, we propose a method for computing invariant global descriptors from CNNs. Our method implements a recently proposed mathematical theory for invariance in a sensory cortex modeled as a feedforward neural network. The resulting global descriptors can be made invariant to multiple arbitrary transformation groups while retaining good discriminativeness. Based on a thorough empirical evaluation using several publicly available datasets, we show that our method is able to significantly and consistently improve retrieval results every time a new type of invariance is incorporated. We also show that our method, which has few parameters, is not prone to overfitting: improvements generalize well across datasets with different properties with regard to invariances. Finally, we show that our descriptors are able to compare favourably to other state-of-the-art compact descriptors in similar bitrate ranges, exceeding the highest retrieval results reported in the literature on some datasets. A dedicated dimensionality reduction step (quantization or hashing) may be able to further improve the competitiveness of the descriptors. This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216. 2016-01-12T01:09:23Z 2016-01-12T01:09:23Z 2016-01-11 Technical Report Working Paper http://hdl.handle.net/1721.1/100796 en_US CBMM Memo Series;043 Attribution-NonCommercial 3.0 United States http://creativecommons.org/licenses/by-nc/3.0/us/ application/pdf Center for Brains, Minds and Machines (CBMM) |
spellingShingle | Global Image Descriptors Convolutional Neural Networks (CNN) Morère, Olivier Veillard, Antoine Lin, Jie Petta, Julie Chandrasekhar, Vijay Poggio, Tomaso Group Invariant Deep Representations for Image Instance Retrieval |
title | Group Invariant Deep Representations for Image Instance Retrieval |
title_full | Group Invariant Deep Representations for Image Instance Retrieval |
title_fullStr | Group Invariant Deep Representations for Image Instance Retrieval |
title_full_unstemmed | Group Invariant Deep Representations for Image Instance Retrieval |
title_short | Group Invariant Deep Representations for Image Instance Retrieval |
title_sort | group invariant deep representations for image instance retrieval |
topic | Global Image Descriptors Convolutional Neural Networks (CNN) |
url | http://hdl.handle.net/1721.1/100796 |
work_keys_str_mv | AT morereolivier groupinvariantdeeprepresentationsforimageinstanceretrieval AT veillardantoine groupinvariantdeeprepresentationsforimageinstanceretrieval AT linjie groupinvariantdeeprepresentationsforimageinstanceretrieval AT pettajulie groupinvariantdeeprepresentationsforimageinstanceretrieval AT chandrasekharvijay groupinvariantdeeprepresentationsforimageinstanceretrieval AT poggiotomaso groupinvariantdeeprepresentationsforimageinstanceretrieval |