Few‐shot object detection via class encoding and multi‐target decoding
Abstract The task of few‐shot object detection is to classify and locate objects through a few annotated samples. Although many studies have tried to solve this problem, the results are still not satisfactory. Recent studies have found that the class margin significantly impacts the classification a...
Main Authors: | Xueqiang Guo, Hanqing Yang, Mohan Wei, Xiaotong Ye, Yu Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley 2023-06-01 |
Series: | IET Cyber-systems and Robotics |
Subjects: | Class Margin, Few‐Shot Object Detection, Multi‐Target, Transformer |
Online Access: | https://doi.org/10.1049/csy2.12088 |
_version_ | 1797796969235611648 |
---|---|
author | Xueqiang Guo; Hanqing Yang; Mohan Wei; Xiaotong Ye; Yu Zhang |
author_facet | Xueqiang Guo; Hanqing Yang; Mohan Wei; Xiaotong Ye; Yu Zhang |
author_sort | Xueqiang Guo |
collection | DOAJ |
description | Abstract The task of few‐shot object detection is to classify and locate objects from only a few annotated samples. Although many studies have tried to solve this problem, the results are still not satisfactory. Recent studies have found that the class margin significantly impacts the classification and representation of the targets to be detected. Most methods use the loss function to balance the class margin, but loss‐based methods yield only a small improvement on the few‐shot object detection problem. In this study, the authors propose a transformer‐based class encoding method to balance the class margin, which makes the model pay more attention to the essential information in the features and thus improves sample recognition. In addition, the authors propose a multi‐target decoding method that aggregates RoI vectors generated from multi‐target images with multiple support vectors, which significantly improves the detector's performance on multi‐target images. Experiments on the Pascal Visual Object Classes (VOC) and Microsoft Common Objects in Context (COCO) datasets show that the proposed method, Few‐Shot Object Detection via Class Encoding and Multi‐Target Decoding, significantly improves upon baseline detectors (the average accuracy improvement is up to 10.8% on VOC and 2.1% on COCO), achieving competitive performance. In summary, the authors propose a new way to regulate the class margin between support set vectors and a feature aggregation method for images containing multiple objects, which together yield the improvements reported above. The method is implemented on mmfewshot, and the code will be available later. |
first_indexed | 2024-03-13T03:41:05Z |
format | Article |
id | doaj.art-f99edfa9c73d410da09d442db28eb907 |
institution | Directory Open Access Journal |
issn | 2631-6315 |
language | English |
last_indexed | 2024-03-13T03:41:05Z |
publishDate | 2023-06-01 |
publisher | Wiley |
record_format | Article |
series | IET Cyber-systems and Robotics |
spelling | doaj.art-f99edfa9c73d410da09d442db28eb907; 2023-06-23T07:58:27Z; eng; Wiley; IET Cyber-systems and Robotics; 2631-6315; 2023-06-01; 5; 2; n/a; n/a; 10.1049/csy2.12088; Few‐shot object detection via class encoding and multi‐target decoding; Xueqiang Guo (0); Hanqing Yang (1); Mohan Wei (2); Xiaotong Ye (3); Yu Zhang (4); State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China (affiliation of all five authors); https://doi.org/10.1049/csy2.12088; Class Margin; Few‐Shot Object Detection; Multi‐Target; Transformer |
spellingShingle | Xueqiang Guo; Hanqing Yang; Mohan Wei; Xiaotong Ye; Yu Zhang; Few‐shot object detection via class encoding and multi‐target decoding; IET Cyber-systems and Robotics; Class Margin; Few‐Shot Object Detection; Multi‐Target; Transformer |
title | Few‐shot object detection via class encoding and multi‐target decoding |
title_full | Few‐shot object detection via class encoding and multi‐target decoding |
title_fullStr | Few‐shot object detection via class encoding and multi‐target decoding |
title_full_unstemmed | Few‐shot object detection via class encoding and multi‐target decoding |
title_short | Few‐shot object detection via class encoding and multi‐target decoding |
title_sort | few shot object detection via class encoding and multi target decoding |
topic | Class Margin; Few‐Shot Object Detection; Multi‐Target; Transformer |
url | https://doi.org/10.1049/csy2.12088 |
work_keys_str_mv | AT xueqiangguo fewshotobjectdetectionviaclassencodingandmultitargetdecoding AT hanqingyang fewshotobjectdetectionviaclassencodingandmultitargetdecoding AT mohanwei fewshotobjectdetectionviaclassencodingandmultitargetdecoding AT xiaotongye fewshotobjectdetectionviaclassencodingandmultitargetdecoding AT yuzhang fewshotobjectdetectionviaclassencodingandmultitargetdecoding |