Eliminating feature ambiguity for few-shot segmentation
Main Authors: | , , , , , |
---|---|
Other Authors: | |
Format: | Conference Paper |
Language: | English |
Published: | 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/180247 http://arxiv.org/abs/2407.09842v1 |
Summary: | Recent advancements in few-shot segmentation (FSS) have exploited
pixel-by-pixel matching between query and support features, typically based on
cross attention, which selectively activates query foreground (FG) features that
correspond to the same-class support FG features. However, due to the large
receptive fields in deep layers of the backbone, the extracted query and
support FG features are inevitably mingled with background (BG) features,
impeding the FG-FG matching in cross attention. Hence, the query FG features
are fused with fewer support FG features, i.e., the support information is not
well utilized. This paper presents a novel plug-in termed ambiguity elimination
network (AENet), which can be plugged into any existing cross-attention-based
FSS method. The main idea is to mine discriminative query FG regions to
rectify the ambiguous FG features, increasing the proportion of FG information,
so as to suppress the negative impact of the mingled BG features. In this way,
the FG-FG matching is naturally enhanced. We plug AENet into three baselines,
CyCTR, SCCAN and HDMNet, for evaluation, and their scores improve by large
margins, e.g., the 1-shot performance of SCCAN can be improved by 3.0%+ on both
PASCAL-5$^i$ and COCO-20$^i$. The code is available at
https://github.com/Sam1224/AENet. |
---|
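
The summary describes pixel-by-pixel matching between query and support features through cross attention. The block below is a minimal, illustrative sketch of that general idea only; it is not the AENet code from the linked repository. It assumes that support background pixels are excluded by setting their attention scores to negative infinity, which is one common way to restrict matching to support FG features. The function names (`softmax`, `masked_cross_attention`) and the toy two-dimensional features are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    finite = [s for s in scores if s != -math.inf]
    if not finite:
        raise ValueError("at least one support pixel must be foreground")
    peak = max(finite)
    exps = [0.0 if s == -math.inf else math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_cross_attention(query_feats, support_feats, support_fg_mask):
    """Fuse each query pixel with support FG pixels (hypothetical helper).

    query_feats:     list of Nq feature vectors (lists of floats)
    support_feats:   list of Ns feature vectors of the same dimension
    support_fg_mask: list of Ns flags, 1 for foreground, 0 for background
    Returns (fused query features, attention weights).
    """
    dim = len(query_feats[0])
    scale = 1.0 / math.sqrt(dim)  # standard scaled dot-product factor
    fused, weights = [], []
    for q in query_feats:
        # Assumption: background support pixels are masked out with -inf.
        scores = [
            scale * sum(a * b for a, b in zip(q, s)) if is_fg else -math.inf
            for s, is_fg in zip(support_feats, support_fg_mask)
        ]
        attn = softmax(scores)
        weights.append(attn)
        # Weighted sum of support features for this query pixel.
        fused.append(
            [sum(w * s[d] for w, s in zip(attn, support_feats)) for d in range(dim)]
        )
    return fused, weights


# Toy data: two FG support pixels and one BG support pixel.
support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
mask = [1, 1, 0]
query = [[1.0, 0.0], [0.5, 0.5]]
fused, weights = masked_cross_attention(query, support, mask)
```

In this toy setting the masked BG support pixel never contributes to the fused query features. The ambiguity discussed in the summary arises when the query or support FG features themselves already contain BG information, which lowers FG-FG similarity scores even under such masking.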