Few‐shot object detection via class encoding and multi‐target decoding

Abstract The task of few‐shot object detection is to classify and locate objects through a few annotated samples. Although many studies have tried to solve this problem, the results are still not satisfactory. Recent studies have found that the class margin significantly impacts the classification a...

Bibliographic Details
Main Authors:	Xueqiang Guo, Hanqing Yang, Mohan Wei, Xiaotong Ye, Yu Zhang
Format:	Article
Language:	English
Published:	Wiley 2023-06-01
Series:	IET Cyber-systems and Robotics
Subjects:	Class Margin Few‐Shot Object Detection Multi‐Target Transformer
Online Access:	https://doi.org/10.1049/csy2.12088

_version_	1797796969235611648
author	Xueqiang Guo Hanqing Yang Mohan Wei Xiaotong Ye Yu Zhang
collection	DOAJ
description	Abstract: The task of few‐shot object detection is to classify and locate objects using only a few annotated samples. Although many studies have tried to solve this problem, the results are still not satisfactory. Recent studies have found that the class margin significantly impacts the classification and representation of the targets to be detected. Most methods use the loss function to balance the class margin, but results show that loss‐based methods yield only a small improvement on the few‐shot object detection problem. In this study, the authors propose a transformer‐based class encoding method to balance the class margin, which makes the model pay more attention to the essential information in the features, thus improving sample recognition. In addition, the authors propose a multi‐target decoding method that aggregates region‐of‐interest (RoI) vectors generated from multi‐target images with multiple support vectors, which can significantly improve the detector's performance on multi‐target images. Experiments on the Pascal Visual Object Classes (VOC) and Microsoft Common Objects in Context (COCO) datasets show that the proposed method, Few‐Shot Object Detection via Class Encoding and Multi‐Target Decoding, significantly improves upon baseline detectors (average accuracy improvement of up to 10.8% on VOC and 2.1% on COCO), achieving competitive performance. Overall, the authors propose a new way to regulate the class margin between support set vectors and a feature aggregation method for images containing multiple objects, and report remarkable results. The method is implemented on mmfewshot, and the code will be available later.
first_indexed	2024-03-13T03:41:05Z
format	Article
id	doaj.art-f99edfa9c73d410da09d442db28eb907
institution	Directory Open Access Journal
issn	2631-6315
language	English
last_indexed	2024-03-13T03:41:05Z
publishDate	2023-06-01
publisher	Wiley
record_format	Article
series	IET Cyber-systems and Robotics
spelling	doaj.art-f99edfa9c73d410da09d442db28eb907; 2023-06-23T07:58:27Z; eng; Wiley; IET Cyber-systems and Robotics; 2631-6315; 2023-06-01; 5; 2; n/a; n/a; 10.1049/csy2.12088; Few‐shot object detection via class encoding and multi‐target decoding; Xueqiang Guo, Hanqing Yang, Mohan Wei, Xiaotong Ye, Yu Zhang (all: State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China); https://doi.org/10.1049/csy2.12088; Class Margin; Few‐Shot Object Detection; Multi‐Target; Transformer
title	Few‐shot object detection via class encoding and multi‐target decoding
topic	Class Margin Few‐Shot Object Detection Multi‐Target Transformer
url	https://doi.org/10.1049/csy2.12088

Similar Items