Transformed Dynamic Feature Pyramid for Small Object Detection

Small targets have low resolution and carry little feature information, which makes them difficult to recognize and locate and greatly hinders improvements in object detection accuracy. This paper presents an object detection model (TDFP) based on a CNN and a transformer, which combines local and globa...


Bibliographic Details
Main Authors: Hong Liang, Ying Yang, Qian Zhang, Linxia Feng, Jie Ren, Qiyao Liang
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/9551884/
author Hong Liang
Ying Yang
Qian Zhang
Linxia Feng
Jie Ren
Qiyao Liang
collection DOAJ
description Small targets have low resolution and carry little feature information, which makes them difficult to recognize and locate and greatly hinders improvements in object detection accuracy. This paper presents an object detection model (TDFP) based on a CNN and a transformer, which combines local and global context to establish connections between features. In the proposed transformed dynamic feature pyramid network, a transformer module dynamically transforms and fuses the multi-scale features generated by the backbone, producing a transformed feature pyramid with richer multi-scale features and context information. During this transformation, a gate block dynamically selects between single-scale and cross-scale transformation to find the best way of transforming and fusing the multi-scale features. Experimental results show that the model improves small-target detection accuracy. With a ResNeXt-101 backbone, TDFP achieves 46.2% AP and 26.3% AP_S on MS COCO, and it uses the amount of computation as a loss constraint to achieve a better balance between detection accuracy and computational complexity.
first_indexed 2024-12-17T06:51:38Z
format Article
id doaj.art-7a3949bbfa86479c952f56dbc4f85169
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-17T06:51:38Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-7a3949bbfa86479c952f56dbc4f85169 | 2022-12-21T21:59:35Z | eng | IEEE | IEEE Access | ISSN 2169-3536 | 2021-01-01 | vol. 9, pp. 134649-134659 | DOI 10.1109/ACCESS.2021.3116324 | 9551884
Transformed Dynamic Feature Pyramid for Small Object Detection
Hong Liang, Ying Yang (https://orcid.org/0000-0002-5972-5239), Qian Zhang, Linxia Feng, Jie Ren, Qiyao Liang
Affiliation (all authors): College of Computer Science and Technology, China University of Petroleum (East China), Shandong, Qingdao, China
https://ieeexplore.ieee.org/document/9551884/
title Transformed Dynamic Feature Pyramid for Small Object Detection
topic Local and global context information
transformer module
transformed feature pyramid
single-scale transformation
cross-scale transformation
url https://ieeexplore.ieee.org/document/9551884/