Transformed Dynamic Feature Pyramid for Small Object Detection
The low resolution and limited feature information of small targets make them difficult to recognize and locate, which greatly hinders improvements in object detection accuracy. This paper presents an object detection model (TDFP) based on a CNN and a transformer, which combines local and globa...
Main Authors: | Hong Liang, Ying Yang, Qian Zhang, Linxia Feng, Jie Ren, Qiyao Liang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2021-01-01 |
Series: | IEEE Access |
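Subjects: | Local and global context information; transformer module; transformed feature pyramid; single-scale transformation; cross-scale transformation |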
Online Access: | https://ieeexplore.ieee.org/document/9551884/ |
_version_ | 1818669401636864000 |
---|---|
author | Hong Liang, Ying Yang, Qian Zhang, Linxia Feng, Jie Ren, Qiyao Liang |
author_sort | Hong Liang |
collection | DOAJ |
description | The low resolution and limited feature information of small targets make them difficult to recognize and locate, which greatly hinders improvements in object detection accuracy. This paper presents an object detection model (TDFP) based on a CNN and a transformer, which combines local and global context to establish connections between features. In the proposed transformed dynamic feature pyramid network, a transformer module dynamically transforms and fuses the multi-scale features generated by the backbone, producing a transformed feature pyramid with richer multi-scale features and context information. During this transformation, a gate block dynamically selects single-scale or cross-scale transformation to achieve the best form of transformation and fusion of multi-scale features. Experimental results show that the CNN- and transformer-based model improves detection accuracy for small targets. With a ResNeXt-101 backbone, TDFP achieves 46.2% AP and 26.3% AP<sub>S</sub> on MS COCO, and it uses the amount of computation as a loss constraint to achieve a better balance between detection accuracy and computational complexity. |
first_indexed | 2024-12-17T06:51:38Z |
format | Article |
id | doaj.art-7a3949bbfa86479c952f56dbc4f85169 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-17T06:51:38Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-7a3949bbfa86479c952f56dbc4f85169; 2022-12-21T21:59:35Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2021-01-01; vol. 9, pp. 134649-134659; DOI 10.1109/ACCESS.2021.3116324; IEEE document 9551884; Transformed Dynamic Feature Pyramid for Small Object Detection; Hong Liang, Ying Yang (https://orcid.org/0000-0002-5972-5239), Qian Zhang, Linxia Feng, Jie Ren, Qiyao Liang (all authors: College of Computer Science and Technology, China University of Petroleum (East China), Shandong, Qingdao, China); https://ieeexplore.ieee.org/document/9551884/; Local and global context information; transformer module; transformed feature pyramid; single-scale transformation; cross-scale transformation |
title | Transformed Dynamic Feature Pyramid for Small Object Detection |
topic | Local and global context information; transformer module; transformed feature pyramid; single-scale transformation; cross-scale transformation |
url | https://ieeexplore.ieee.org/document/9551884/ |
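
The gating idea summarized in the description can be illustrated with a toy sketch. This is not the authors' implementation: the record gives no architectural details, so every function name, the hard threshold, the stand-in transformations, and the cost values below are hypothetical. It only shows the general pattern of a per-level gate choosing between a single-scale and a cross-scale path, with the resulting computation added to the loss as a penalty.

```python
import math


def sigmoid(x):
    """Squash a gate logit into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def single_scale_transform(feature):
    # Hypothetical stand-in: centre the level using only its own values.
    mean = sum(feature) / len(feature)
    return [v - mean for v in feature]


def cross_scale_transform(feature, other_levels):
    # Hypothetical stand-in: add context averaged over the other levels.
    context = sum(sum(lvl) / len(lvl) for lvl in other_levels) / len(other_levels)
    return [v + context for v in feature]


def gated_fusion(pyramid, gate_logits, costs=(1.0, 3.0), threshold=0.5):
    """Pick one path per pyramid level and track an assumed compute cost.

    costs = (single-scale cost, cross-scale cost); both values are made up.
    """
    transformed = []
    total_cost = 0.0
    for i, feature in enumerate(pyramid):
        others = pyramid[:i] + pyramid[i + 1:]
        if sigmoid(gate_logits[i]) >= threshold and others:
            transformed.append(cross_scale_transform(feature, others))
            total_cost += costs[1]
        else:
            transformed.append(single_scale_transform(feature))
            total_cost += costs[0]
    return transformed, total_cost


def total_loss(detection_loss, compute_cost, weight=0.01):
    # Computation enters the objective as a weighted penalty term (assumed form).
    return detection_loss + weight * compute_cost


if __name__ == "__main__":
    pyramid = [[1.0, 3.0], [2.0, 2.0], [0.0, 4.0]]  # three toy "levels"
    fused, cost = gated_fusion(pyramid, gate_logits=[-2.0, 2.0, -2.0])
    print(fused, cost, total_loss(1.0, cost, weight=0.1))
```

In this sketch only the middle level takes the more expensive cross-scale path, so raising the penalty weight would push a trained gate toward the cheaper single-scale path, which is the accuracy versus computation trade-off the description refers to.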