Zero-Shot Image Classification Based on a Learnable Deep Metric

Supervised deep learning models have achieved strong results in image classification when trained on large numbers of labeled samples. In practice, however, many categories have only a few labeled training samples, and some categories even have no tra...


Bibliographic Details
Main Authors: Jingyi Liu, Caijuan Shi, Dongjing Tu, Ze Shi, Yazhi Liu
Format: Article
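Language: English
Published: MDPI AG 2021-05-01
Series: Sensors
Subjects: zero-shot learning; deep metric; common space embedding; relation network; image classification; deep learning
Online Access: https://www.mdpi.com/1424-8220/21/9/3241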
_version_ 1797534996503724032
author Jingyi Liu
Caijuan Shi
Dongjing Tu
Ze Shi
Yazhi Liu
collection DOAJ
description Supervised deep learning models have achieved strong results in image classification when trained on large numbers of labeled samples. In practice, however, many categories have only a few labeled training samples, and some have none at all. Zero-shot learning greatly reduces the dependence of image classification models on labeled training samples. Nevertheless, learning the similarity between visual features and semantic features with a predefined fixed metric (e.g., Euclidean distance) is limiting, and the mapping process suffers from a semantic gap. To address these problems, this paper proposes a new zero-shot image classification method based on an end-to-end learnable deep metric. First, common space embedding maps the visual features and semantic features into a common space. Second, an end-to-end learnable deep metric, namely a relation network, learns the similarity between the visual features and semantic features. Finally, images of unseen classes are classified according to the similarity score. Extensive experiments on four datasets indicate the effectiveness of the proposed method.
first_indexed 2024-03-10T11:38:20Z
format Article
id doaj.art-2eccfe51db3440bd98148b109285bfc5
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-10T11:38:20Z
publishDate 2021-05-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-2eccfe51db3440bd98148b109285bfc5
2023-11-21T18:41:44Z
eng
MDPI AG
Sensors
1424-8220
2021-05-01
volume 21, issue 9, article 3241
10.3390/s21093241
Zero-Shot Image Classification Based on a Learnable Deep Metric
Jingyi Liu, Caijuan Shi, Dongjing Tu, Ze Shi, Yazhi Liu
College of Information Engineering, North China University of Science and Technology, Tangshan 063210, China (affiliation of all five authors)
https://www.mdpi.com/1424-8220/21/9/3241
zero-shot learning; deep metric; common space embedding; relation network; image classification; deep learning
title Zero-Shot Image Classification Based on a Learnable Deep Metric
topic zero-shot learning
deep metric
common space embedding
relation network
image classification
deep learning
url https://www.mdpi.com/1424-8220/21/9/3241
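
The three steps named in the description (common space embedding, a relation network that scores similarity, and classification by the highest score) can be illustrated with a minimal NumPy sketch. This record does not give the paper's architecture, so everything below is an assumption made for illustration: the class name RelationScorer, the layer sizes, the feature dimensions (2048 and 85), and the randomly initialised weights that stand in for trained parameters. It shows only a forward pass, not the authors' training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RelationScorer:
    """Toy forward pass: common-space embedding plus a learnable relation module.

    Hypothetical stand-in, not the architecture from the paper.
    """

    def __init__(self, visual_dim, semantic_dim, common_dim, hidden_dim):
        # Random weights stand in for parameters that would be learned end to end.
        self.w_visual = rng.normal(0.0, 0.1, (visual_dim, common_dim))
        self.w_semantic = rng.normal(0.0, 0.1, (semantic_dim, common_dim))
        self.w_hidden = rng.normal(0.0, 0.1, (2 * common_dim, hidden_dim))
        self.w_out = rng.normal(0.0, 0.1, (hidden_dim, 1))

    def scores(self, visual, semantic):
        # Step 1: map both modalities into the common space.
        v = relu(visual @ self.w_visual)      # (n_images, common_dim)
        s = relu(semantic @ self.w_semantic)  # (n_classes, common_dim)
        # Pair every image embedding with every class embedding.
        n, c = v.shape[0], s.shape[0]
        pairs = np.concatenate(
            [np.repeat(v[:, None, :], c, axis=1),
             np.repeat(s[None, :, :], n, axis=0)],
            axis=-1,
        )                                     # (n_images, n_classes, 2 * common_dim)
        # Step 2: the relation module replaces a fixed metric with a learned score in (0, 1).
        hidden = relu(pairs @ self.w_hidden)
        return sigmoid(hidden @ self.w_out)[..., 0]  # (n_images, n_classes)

    def classify(self, visual, semantic):
        # Step 3: assign each image to the class with the highest similarity score.
        return self.scores(visual, semantic).argmax(axis=1)


# Placeholder data: 5 image feature vectors and 10 unseen-class semantic vectors.
model = RelationScorer(visual_dim=2048, semantic_dim=85, common_dim=64, hidden_dim=32)
images = rng.normal(size=(5, 2048))
unseen_classes = rng.normal(size=(10, 85))
sim = model.scores(images, unseen_classes)
pred = model.classify(images, unseen_classes)
```

With a fixed metric such as Euclidean distance, step 2 would be a closed-form function of the two embeddings; here it is a small network whose weights could be trained jointly with the embedding layers, which is the sense in which the metric is learnable.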