Small object detection based on hierarchical attention mechanism and multi‐scale separable detection
Abstract Detecting small targets remains an unresolved problem in object detection, compared with detecting medium and large targets. Accurate detection and identification of small objects in real‐world scenarios suffers from sub‐optimal per...
Main Authors: | Yafeng Zhang, Junyang Yu, Yuanyuan Wang, Shuang Tang, Han Li, Zhiyi Xin, Chaoyi Wang, Ziming Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley 2023-12-01 |
Series: | IET Image Processing |
Subjects: | image recognition, object detection |
Online Access: | https://doi.org/10.1049/ipr2.12912 |
_version_ | 1797404533186363392 |
---|---|
author | Yafeng Zhang, Junyang Yu, Yuanyuan Wang, Shuang Tang, Han Li, Zhiyi Xin, Chaoyi Wang, Ziming Zhao |
author_facet | Yafeng Zhang, Junyang Yu, Yuanyuan Wang, Shuang Tang, Han Li, Zhiyi Xin, Chaoyi Wang, Ziming Zhao |
author_sort | Yafeng Zhang |
collection | DOAJ |
description | Abstract Detecting small targets remains an unresolved problem in object detection, compared with detecting medium and large targets. Accurate detection and identification of small objects in real‐world scenarios suffers from sub‐optimal performance due to factors such as small target size, complex backgrounds, variable illumination, occlusion, and target distortion. Here, a small object detection method for complex traffic scenarios, named deformable local and global attention (DLGADet), is proposed; it merges hierarchical attention mechanisms (HAMs) with deformable multi‐scale feature fusion to improve recognition and detection performance. First, DLGADet combines multi‐scale separable detection with a multi‐scale feature fusion mechanism to obtain richer contextual information for feature fusion while addressing the misalignment between the classification and localisation tasks. Second, a deformation feature extraction module (DFEM) is designed to address the deformation of objects. Finally, a HAM combining global and local attention mechanisms is designed to obtain discriminative features from complex backgrounds. Extensive experiments on three datasets demonstrate the effectiveness of the proposed methods. Code is available at https://github.com/ACAMPUS/DLGADet |
first_indexed | 2024-03-09T02:56:22Z |
format | Article |
id | doaj.art-b689c7317a3c4012a7359a036a13d572 |
institution | Directory Open Access Journal |
issn | 1751-9659, 1751-9667 |
language | English |
last_indexed | 2024-03-09T02:56:22Z |
publishDate | 2023-12-01 |
publisher | Wiley |
record_format | Article |
series | IET Image Processing |
spelling | doaj.art-b689c7317a3c4012a7359a036a13d572; 2023-12-05T06:22:51Z; eng; Wiley; IET Image Processing; 1751-9659, 1751-9667; 2023-12-01; vol. 17, no. 14, pp. 3986-3999; 10.1049/ipr2.12912; Small object detection based on hierarchical attention mechanism and multi‐scale separable detection; Yafeng Zhang (0), Junyang Yu (1), Yuanyuan Wang (2), Shuang Tang (3), Han Li (4), Zhiyi Xin (5), Chaoyi Wang (6), Ziming Zhao (7); Affiliations: (0) School of Software, Henan University, Kaifeng, Henan, China; (1) School of Software, Henan University, Kaifeng, Henan, China; (2) Economic and Technical Research Institute, State Grid Henan Province Electric Power Company, Zhengzhou, Henan, China; (3) School of Economics, Henan University, Kaifeng, Henan, China; (4) School of Economics, Henan University, Kaifeng, Henan, China; (5) School of Software, Henan University, Kaifeng, Henan, China; (6) Electrical and Computer Engineering, Shanghai Institute of Microsystem and Information Technology, Shanghai, China; (7) School of Software, Henan University, Kaifeng, Henan, China; https://doi.org/10.1049/ipr2.12912; image recognition, object detection |
title | Small object detection based on hierarchical attention mechanism and multi‐scale separable detection |
topic | image recognition, object detection |
url | https://doi.org/10.1049/ipr2.12912 |