Learning global image representation with generalized‐mean pooling and smoothed average precision for large‐scale CBIR

Abstract: Content‐based image retrieval (CBIR) is the problem of searching an image database for items that are similar to a query image. Most existing image retrieval methods are trained with metric learning loss functions (e.g. contrastive loss or triplet loss), which requi...


Bibliographic Details
Main Authors: Jinliang Yao, Yongqing Li, Bing Yang, Chenrui Wang
Format: Article
Language: English
Published: Wiley 2023-07-01
Series: IET Image Processing
Subjects: computer vision; content‐based retrieval; image retrieval
Online Access: https://doi.org/10.1049/ipr2.12825
author Jinliang Yao
Yongqing Li
Bing Yang
Chenrui Wang
author_sort Jinliang Yao
collection DOAJ
description Abstract: Content‐based image retrieval (CBIR) is the problem of searching an image database for items that are similar to a query image. Most existing image retrieval methods are trained with metric learning loss functions (e.g. contrastive loss or triplet loss), which require hard sample mining strategies (HMS) to train the model well. Picking out hard positive or negative samples increases the complexity of model training and adds a large amount of training time. To address this issue, lessons from recent work on representation learning are leveraged, and a model called GS is proposed that combines the state‐of‐the‐art Generalized‐Mean (GeM) pooling with the smoothed average precision (AP). The entire network can be learned end‐to‐end by approximating the non‐differentiable AP function with a differentiable one, without mining hard samples and using only image‐level annotations. A model named GSA is also presented, which achieves excellent retrieval performance when jointly trained with two different loss functions. Experimental results validate the effectiveness of the proposed approach and demonstrate competitive performance on standard image retrieval benchmarks (Revisited Oxford and Paris).
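The two building blocks named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. `gem_pool` follows the commonly published GeM definition, f_c = (mean over locations of x^p)^(1/p). `smooth_ap` assumes a sigmoid relaxation of the ranking indicator with temperature `tau`; the record does not say which differentiable AP approximation the paper uses, so that choice, the function names, and the default values of `p`, `eps`, and `tau` are all assumptions made for illustration.

```python
import math


def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pooling over spatial locations.

    feature_map: list of channels, each a flat list of activations.
    Returns one pooled value per channel. p=1 gives average pooling;
    large p approaches max pooling.
    """
    pooled = []
    for channel in feature_map:
        # Clamp to a small positive value so x ** p is well defined.
        clamped = [max(x, eps) for x in channel]
        mean_pow = sum(x ** p for x in clamped) / len(clamped)
        pooled.append(mean_pow ** (1.0 / p))
    return pooled


def _sigmoid(x, tau):
    """Numerically stable sigmoid of x / tau."""
    z = x / tau
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def smooth_ap(scores, labels, tau=0.01):
    """Sigmoid-relaxed average precision for one query (assumed variant).

    scores: similarity of each database item to the query.
    labels: truthy for relevant items, falsy otherwise.
    The hard indicator [s_j > s_i] used in ranking is replaced by
    sigmoid((s_j - s_i) / tau), which makes the value differentiable
    with respect to the scores.
    """
    positives = [i for i, y in enumerate(labels) if y]
    if not positives:
        raise ValueError("at least one relevant item is required")
    total = 0.0
    for i in positives:
        # Soft rank of item i among the relevant items only.
        rank_pos = 1.0 + sum(
            _sigmoid(scores[j] - scores[i], tau) for j in positives if j != i
        )
        # Soft rank of item i among all items.
        rank_all = 1.0 + sum(
            _sigmoid(scores[j] - scores[i], tau)
            for j in range(len(scores))
            if j != i
        )
        total += rank_pos / rank_all
    return total / len(positives)
```

A training loss under these assumptions would be `1.0 - smooth_ap(scores, labels)` averaged over queries. Because it is computed over the whole ranked list using only relevance labels, no hard positive or negative mining step is needed, which is the property the abstract emphasizes. In practice this would be written in an automatic-differentiation framework rather than plain Python.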
first_indexed 2024-03-13T01:04:01Z
format Article
id doaj.art-8635915b7425460e9a343e7d8bd05b8b
institution Directory Open Access Journal
issn 1751-9659
1751-9667
language English
last_indexed 2024-03-13T01:04:01Z
publishDate 2023-07-01
publisher Wiley
record_format Article
series IET Image Processing
spelling doaj.art-8635915b7425460e9a343e7d8bd05b8b
2023-07-06T09:05:42Z
eng
Wiley
IET Image Processing
1751-9659
1751-9667
2023-07-01
volume 17, issue 9, pages 2748-2763
10.1049/ipr2.12825
Learning global image representation with generalized‐mean pooling and smoothed average precision for large‐scale CBIR
Jinliang Yao (School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China)
Yongqing Li (School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China)
Bing Yang (School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China)
Chenrui Wang (School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China)
https://doi.org/10.1049/ipr2.12825
computer vision
content‐based retrieval
image retrieval
title Learning global image representation with generalized‐mean pooling and smoothed average precision for large‐scale CBIR
topic computer vision
content‐based retrieval
image retrieval
url https://doi.org/10.1049/ipr2.12825