Redefining prior feature space via finetuning a triplet network for few‐shot learning

Bibliographic Details
Main Authors: Jiaying Wu, Jinglu Hu
Format: Article
Language: English
Published: Wiley 2022-09-01
Series: IET Computer Vision
Online Access: https://doi.org/10.1049/cvi2.12109
Description
Summary: Few‐shot learning aims to distinguish novel concepts from only a few annotated examples, and it has attracted much attention because it requires little training data for the target classes. Recent few‐shot learning methods usually pretrain a feature extractor on images from the base set to boost performance on few‐shot tasks, and then classify novel categories in this prior feature space. However, the pretrained feature extractor struggles to extract accurate representations for novel categories, which leaves large overlapping regions between the new classes. To address this issue, the authors refine the prior feature space with a triplet network to learn a more discriminative space, in which features of the same class are pulled together and features of different classes are pushed apart. Specifically, the authors first follow the recent pretraining paradigm to obtain a prior feature space. Then, a triplet network is trained with contrastive learning to project the features from this space into a low‐dimensional one. The main difference from earlier approaches is that the authors' model is based on Maximum A Posteriori (MAP) estimation, and the triplet network is finetuned from it with hallucinated features so that the learned features generalise well to novel categories. Finally, the authors perform classification in the finetuned space. The authors' intuition is that the overlapping regions between novel categories can be separated by finetuning, with contrastive learning, the triplet network pretrained on the base set. Experimental results on four few‐shot benchmarks show that the method significantly outperforms the baseline methods, improving on the best results for each dataset by roughly 1.09% to 13.09% on both 1‐ and 5‐shot tasks.
ISSN: 1751-9632, 1751-9640
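
The "pull together, push apart" objective mentioned in the summary can be illustrated with a standard triplet margin loss. The following is a minimal sketch and not the authors' implementation: this record does not specify the loss, so the Euclidean distance, the margin of 1.0, the function names, and the toy 2-D features are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (assumed form, not taken from the paper).

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is; otherwise it penalises the shortfall.
    """
    d_pos = euclidean(anchor, positive)  # same-class distance (to shrink)
    d_neg = euclidean(anchor, negative)  # different-class distance (to grow)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D features (hypothetical values).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]       # same class as the anchor, distance 1.0
far_negative = [3.0, 4.0]   # other class, distance 5.0
near_negative = [0.0, 1.5]  # other class, distance 1.5 (overlapping region)

well_separated = triplet_margin_loss(anchor, positive, far_negative)
overlapping = triplet_margin_loss(anchor, positive, near_negative)
```

In this toy case the well-separated triplet contributes no loss, while the triplet whose negative lies close to the anchor is penalised. That is the kind of overlap between classes the summary says finetuning is meant to reduce.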