Connecting Images through Sources: Exploring Low-Data, Heterogeneous Instance Retrieval

Along with a new volume of images containing valuable information about our past, the digitization of historical territorial imagery has brought the challenge of understanding and interconnecting collections with unique or rare representation characteristics, and sparse metadata. Content-based image...


Bibliographic Details
Main Authors: Dimitri Gominski, Valérie Gouet-Brunet, Liming Chen
Format: Article
Language: English
Published: MDPI AG 2021-08-01
Series: Remote Sensing
Subjects: CBIR; cross-domain; cultural heritage; benchmarking; diffusion
Online Access: https://www.mdpi.com/2072-4292/13/16/3080
_version_ 1797522199328849920
author Dimitri Gominski
Valérie Gouet-Brunet
Liming Chen
author_sort Dimitri Gominski
collection DOAJ
description Along with a new volume of images containing valuable information about our past, the digitization of historical territorial imagery has brought the challenge of understanding and interconnecting collections with unique or rare representation characteristics, and sparse metadata. Content-based image retrieval offers a promising solution in this context, by building links in the data without relying on human supervision. However, while the latest propositions in deep learning have shown impressive results in applications linked to feature learning, they often rely on the hypothesis that there exists a training dataset matching the use case. Increasing generalization and robustness to variations remains an open challenge, poorly understood in the context of real-world applications. Introducing the ALEGORIA benchmark, containing multi-date vertical and oblique aerial digitized photography mixed with more modern street-level pictures, we formulate the problem of low-data, heterogeneous image retrieval, and propose associated evaluation setups and measures. We propose a review of ideas and methods to tackle this problem, extensively compare state-of-the-art descriptors and propose a new multi-descriptor diffusion method to exploit their comparative strengths. Our experiments highlight the benefits of combining descriptors and the compromise between absolute and cross-domain performance.
first_indexed 2024-03-10T08:26:06Z
format Article
id doaj.art-21de93e8a14f4203925e93da3f61fd12
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-10T08:26:06Z
publishDate 2021-08-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-21de93e8a14f4203925e93da3f61fd12
2023-11-22T09:31:29Z
eng
MDPI AG
Remote Sensing
2072-4292
2021-08-01
13(16), 3080
doi:10.3390/rs13163080
Connecting Images through Sources: Exploring Low-Data, Heterogeneous Instance Retrieval
Dimitri Gominski (LaSTIG, IGN-ENSG, Gustave Eiffel University, 77420 Champs-sur-Marne, France)
Valérie Gouet-Brunet (LaSTIG, IGN-ENSG, Gustave Eiffel University, 77420 Champs-sur-Marne, France)
Liming Chen (LIRIS, École Centrale de Lyon, 69134 Écully, France)
https://www.mdpi.com/2072-4292/13/16/3080
CBIR
cross-domain
cultural heritage
benchmarking
diffusion
title Connecting Images through Sources: Exploring Low-Data, Heterogeneous Instance Retrieval
title_sort connecting images through sources exploring low data heterogeneous instance retrieval
topic CBIR
cross-domain
cultural heritage
benchmarking
diffusion
url https://www.mdpi.com/2072-4292/13/16/3080