Few‐shot object detection via class encoding and multi‐target decoding

Abstract The task of few‐shot object detection is to classify and locate objects through a few annotated samples. Although many studies have tried to solve this problem, the results are still not satisfactory. Recent studies have found that the class margin significantly impacts the classification a...


Bibliographic Details
Main Authors: Xueqiang Guo, Hanqing Yang, Mohan Wei, Xiaotong Ye, Yu Zhang
Format: Article
Language: English
Published: Wiley 2023-06-01
Series: IET Cyber-systems and Robotics
Subjects: Class Margin, Few‐Shot Object Detection, Multi‐Target, Transformer
Online Access: https://doi.org/10.1049/csy2.12088
collection DOAJ
description Abstract The task of few‐shot object detection is to classify and locate objects given only a few annotated samples. Although many studies have tried to solve this problem, the results are still not satisfactory. Recent studies have found that the class margin significantly impacts the classification and representation of the targets to be detected. Most methods use the loss function to balance the class margin, but the results show that loss‐based methods yield only a small improvement on the few‐shot object detection problem. In this study, the authors propose a transformer‐based class encoding method to balance the class margin, which makes the model pay more attention to the essential information in the features and thus improves its recognition ability for the samples. In addition, the authors propose a multi‐target decoding method that aggregates RoI vectors generated from multi‐target images with multiple support vectors, which significantly improves the detector's performance on multi‐target images. Experiments on the Pascal visual object classes (VOC) and Microsoft Common Objects in Context (COCO) datasets show that the proposed Few‐Shot Object Detection via Class Encoding and Multi‐Target Decoding significantly improves upon baseline detectors (average accuracy improvement of up to 10.8% on VOC and 2.1% on COCO), achieving competitive performance. In summary, the authors propose a new way to regulate the class margin between support set vectors and a feature aggregation method for images containing multiple objects. The method is implemented on mmfewshot, and the code will be available later.
first_indexed 2024-03-13T03:41:05Z
id doaj.art-f99edfa9c73d410da09d442db28eb907
institution Directory Open Access Journal
issn 2631-6315
last_indexed 2024-03-13T03:41:05Z
affiliation (all five authors) State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China
topic Class Margin
Few‐Shot Object Detection
Multi‐Target
Transformer