Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison
We consider the problem of image set comparison, i.e., determining whether two image sets show the same unique object from (approximately) the same viewpoints. We propose to solve it by a multi-stream fusion of several image recognition paths. Immediate applications of this method can be fo...
Main Authors: | Paweł Piwowarski, Włodzimierz Kasprzak |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-06-01 |
Series: | Applied Sciences |
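Subjects: | image sets; set similarity metric; same object verification; same view verification; DeepRanking; BSS error index |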
Online Access: | https://www.mdpi.com/2076-3417/11/13/5863 |
_version_ | 1797528989988814848 |
---|---|
author | Paweł Piwowarski; Włodzimierz Kasprzak |
author_facet | Paweł Piwowarski; Włodzimierz Kasprzak |
author_sort | Paweł Piwowarski |
collection | DOAJ |
description | We consider the problem of image set comparison, i.e., determining whether two image sets show the same unique object from (approximately) the same viewpoints. We propose to solve it by a multi-stream fusion of several image recognition paths. Immediate applications of this method can be found in fraud detection, deduplication procedures, or visual search. The contributions of this paper are a novel distance measure for the similarity of image sets and an experimental evaluation of several streams for the considered problem of same-car image set recognition. To determine a similarity score of image sets (this score expresses the certainty level that both sets represent the same object visible from the same set of views), we adapted a measure commonly applied in blind signal separation (BSS) evaluation. This measure is independent of the number of images in a set and of the order of views in it. Separate streams were designed for object classification (where a class represents either a car type or a car model-and-view) and for object-to-object similarity evaluation (based on object features obtained alternatively by a convolutional neural network (CNN) or by image keypoint descriptors). A late fusion by a fully-connected neural network (NN) completes the solution. The implementation is modular: for semantic segmentation we use a Mask-RCNN (Mask regions with CNN features) with ResNet 101 as the backbone network; image feature extraction is based on either the DeepRanking neural network or classic keypoint descriptors (e.g., scale-invariant feature transform (SIFT)); and object classification is performed by two Inception V3 deep networks trained for car type-and-view and car model-and-view classification (4 views, 9 car types, and 197 car models are considered). Experiments conducted on the Stanford Cars dataset led to the selection of the best system configuration, which outperforms a baseline approach and achieves a 67.7% GAR (genuine acceptance rate) at 3% FAR (false acceptance rate). |
first_indexed | 2024-03-10T10:06:04Z |
format | Article |
id | doaj.art-f5e12e3a686a41faa23e77668181d11e |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T10:06:04Z |
publishDate | 2021-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-f5e12e3a686a41faa23e77668181d11e; 2023-11-22T01:32:24Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2021-06-01; vol. 11, no. 13, art. 5863; DOI 10.3390/app11135863; Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison; Paweł Piwowarski and Włodzimierz Kasprzak (both: Institute of Control and Computation Engineering, Warsaw University of Technology, 00-665 Warsaw, Poland); https://www.mdpi.com/2076-3417/11/13/5863; image sets; set similarity metric; same object verification; same view verification; DeepRanking; BSS error index |
title | Evaluation of Multi-Stream Fusion for Multi-View Image Set Comparison |
topic | image sets; set similarity metric; same object verification; same view verification; DeepRanking; BSS error index |
url | https://www.mdpi.com/2076-3417/11/13/5863 |
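
The description in this record mentions a set-similarity score adapted from a measure used in blind signal separation (BSS) evaluation, but the record does not give the formula. The following is only a minimal sketch of one plausible reading: an Amari-style error index computed on a pairwise similarity matrix between two image sets. The function names `pairwise_cosine` and `amari_style_error_index`, the choice of cosine similarity, and the normalization to the range [0, 1] are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def pairwise_cosine(set_a, set_b):
    """Non-negative cosine similarities between two sets of feature vectors.

    set_a has shape (m, d), set_b has shape (n, d); rows are assumed non-zero.
    """
    a = np.asarray(set_a, dtype=float)
    b = np.asarray(set_b, dtype=float)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    # Negative similarities are clipped to zero (an assumption of this sketch).
    return np.clip(a @ b.T, 0.0, None)


def amari_style_error_index(sim):
    """Amari-style error index of a similarity matrix, scaled to [0, 1].

    0 means every row and every column has a single dominant entry (the two
    sets match one-to-one in some order); 1 means all entries are equal.
    """
    s = np.abs(np.asarray(sim, dtype=float))
    if s.ndim != 2 or s.size == 0:
        raise ValueError("sim must be a non-empty 2-D matrix")

    def spread(mat):
        # Per row: (sum / max - 1) / (k - 1), averaged over all rows.
        k = mat.shape[1]
        if k == 1:
            return 0.0
        row_max = mat.max(axis=1)
        safe_max = np.where(row_max > 0, row_max, 1.0)
        ratio = mat.sum(axis=1) / safe_max - 1.0
        # An all-zero row has no match at all; count it as fully ambiguous.
        ratio = np.where(row_max > 0, ratio, k - 1.0)
        return float(np.mean(ratio / (k - 1)))

    # Average the row-wise and column-wise terms, as in the Amari index.
    return 0.5 * (spread(s) + spread(s.T))
```

For example, if `set_b` contains the same feature vectors as `set_a` in a different order, `amari_style_error_index(pairwise_cosine(set_a, set_b))` is zero regardless of the ordering, which illustrates the order independence claimed in the description. Note that this index measures only how close the matrix is to a one-to-one matching, not how large the similarities are, so a real system would presumably combine it with other evidence, as the fusion described above suggests.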