Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison

Bibliographic Details
Main Authors: Paweł Piwowarski, Włodzimierz Kasprzak
Format: Article
Language: English
Published: MDPI AG 2021-06-01
Series: Applied Sciences
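Subjects: image sets; set similarity metric; same object verification; same view verification; DeepRanking; BSS error index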
Online Access: https://www.mdpi.com/2076-3417/11/13/5863
_version_ 1797528989988814848
author Paweł Piwowarski
Włodzimierz Kasprzak
author_facet Paweł Piwowarski
Włodzimierz Kasprzak
author_sort Paweł Piwowarski
collection DOAJ
description We consider the problem of image set comparison, i.e., determining whether two image sets show the same unique object from (approximately) the same viewpoints. We propose to solve it by a multi-stream fusion of several image recognition paths. Immediate applications can be found in fraud detection, deduplication procedures, and visual search. The contributions of this paper are a novel distance measure for the similarity of image sets and an experimental evaluation of several streams for the problem of same-car image set recognition. To determine a similarity score of two image sets (the score expresses the certainty that both sets represent the same object seen from the same set of views), we adapted a measure commonly applied in the evaluation of blind signal separation (BSS). This measure is independent of the number of images in a set and of the order of views in it. Separate streams were designed for object classification (where a class represents either a car type or a car model-and-view) and for object-to-object similarity evaluation (based on object features obtained either by a convolutional neural network (CNN) or by image keypoint descriptors). A late fusion by a fully connected neural network (NN) completes the solution. The implementation is modular: for semantic segmentation we use Mask R-CNN (Mask regions with CNN features) with ResNet 101 as the backbone network; image feature extraction is based either on the DeepRanking neural network or on classic keypoint descriptors (e.g., the scale-invariant feature transform (SIFT)); and object classification is performed by two Inception V3 deep networks trained for car type-and-view and car model-and-view classification (4 views, 9 car types, and 197 car models are considered). Experiments conducted on the Stanford Cars dataset led to the selection of the best system configuration, which outperforms a base approach and achieves a 67.7% GAR (genuine acceptance rate) at 3% FAR (false acceptance rate).
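The abstract says the set similarity score is adapted from a measure used in BSS evaluation and does not depend on the order of views. This record does not give the formula, so the following is only a minimal sketch of one well-known measure of that kind, the Amari-style error index, applied to a matrix of pairwise image similarities. The function name `permutation_error_index`, the input matrix, and the size normalisation are assumptions made for illustration, not the authors' definition.

```python
import numpy as np


def permutation_error_index(sim):
    """Amari-style error index of a pairwise similarity matrix.

    sim[i, j] is an (assumed) similarity between image i of the first set
    and image j of the second set. The index is 0.0 when each row and each
    column has exactly one non-zero entry (every image has one clear
    partner) and 1.0 when all entries are equal (no image is distinguishable).
    """
    s = np.abs(np.asarray(sim, dtype=float))
    n_rows, n_cols = s.shape
    row_max = s.max(axis=1, keepdims=True)
    col_max = s.max(axis=0, keepdims=True)
    if np.any(row_max == 0) or np.any(col_max == 0):
        raise ValueError("every row and column needs a non-zero entry")

    # Each term is 0 when the row (column) has a single non-zero entry.
    row_term = (s / row_max).sum(axis=1) - 1.0
    col_term = (s / col_max).sum(axis=0) - 1.0

    # Hypothetical normalisation: divide by the value reached when all
    # entries are equal, so the result lies in [0, 1] for any matrix size.
    denom = n_rows * (n_cols - 1) + n_cols * (n_rows - 1)
    if denom == 0:  # 1x1 matrix: a single pair is trivially matched
        return 0.0
    return float((row_term.sum() + col_term.sum()) / denom)


if __name__ == "__main__":
    sim = np.array([[0.9, 0.1, 0.2],
                    [0.1, 0.8, 0.3],
                    [0.2, 0.1, 0.7]])
    shuffled = sim[[2, 0, 1]][:, [1, 2, 0]]  # same images, different order
    print(permutation_error_index(sim))
    print(permutation_error_index(shuffled))
```

Because the index only sums over rows and columns, reordering the images in either set leaves it unchanged up to floating-point rounding, which is the order independence the abstract describes.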
first_indexed 2024-03-10T10:06:04Z
format Article
id doaj.art-f5e12e3a686a41faa23e77668181d11e
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T10:06:04Z
publishDate 2021-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-f5e12e3a686a41faa23e77668181d11e 2023-11-22T01:32:24Z eng MDPI AG Applied Sciences 2076-3417 2021-06-01 11 13 5863 10.3390/app11135863
Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison
Paweł Piwowarski, Institute of Control and Computation Engineering, Warsaw University of Technology, 00-665 Warsaw, Poland
Włodzimierz Kasprzak, Institute of Control and Computation Engineering, Warsaw University of Technology, 00-665 Warsaw, Poland
https://www.mdpi.com/2076-3417/11/13/5863
image sets; set similarity metric; same object verification; same view verification; DeepRanking; BSS error index
spellingShingle Paweł Piwowarski
Włodzimierz Kasprzak
Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison
Applied Sciences
image sets
set similarity metric
same object verification
same view verification
DeepRanking
BSS error index
title Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison
title_full Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison
title_fullStr Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison
title_full_unstemmed Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison
title_short Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison
title_sort evaluation of multi stream fusion for multi view image set comparison
topic image sets
set similarity metric
same object verification
same view verification
DeepRanking
BSS error index
url https://www.mdpi.com/2076-3417/11/13/5863
work_keys_str_mv AT pawełpiwowarski evaluationofmultistreamfusionformultiviewimagesetcomparison
AT włodzimierzkasprzak evaluationofmultistreamfusionformultiviewimagesetcomparison