Small object detection based on hierarchical attention mechanism and multi‐scale separable detection

Abstract: The ability of modern detectors to detect small targets remains an unresolved problem in object detection, in contrast to their capability on medium and large targets. Accurately detecting and identifying small objects in real‐world scenarios suffers from sub‐optimal per...


Bibliographic Details
Main Authors: Yafeng Zhang, Junyang Yu, Yuanyuan Wang, Shuang Tang, Han Li, Zhiyi Xin, Chaoyi Wang, Ziming Zhao
Format: Article
Language: English
Published: Wiley 2023-12-01
Series: IET Image Processing
Subjects: image recognition, object detection
Online Access: https://doi.org/10.1049/ipr2.12912
author Yafeng Zhang
Junyang Yu
Yuanyuan Wang
Shuang Tang
Han Li
Zhiyi Xin
Chaoyi Wang
Ziming Zhao
author_sort Yafeng Zhang
collection DOAJ
description Abstract: The ability of modern detectors to detect small targets remains an unresolved problem in object detection, in contrast to their capability on medium and large targets. Accurately detecting and identifying small objects in real‐world scenarios suffers from sub‐optimal performance due to factors such as small target size, complex backgrounds, variable illumination, occlusion, and target distortion. Here, a small object detection method for complex traffic scenarios named deformable local and global attention (DLGADet) is proposed. It combines hierarchical attention mechanisms (HAMs) with deformable multi‐scale feature fusion to improve recognition and detection performance. First, DLGADet combines multi‐scale separable detection with a multi‐scale feature fusion mechanism to obtain richer contextual information for feature fusion while addressing the misalignment between the classification and localisation tasks. Second, a deformation feature extraction module (DFEM) is designed to handle object deformation. Finally, a HAM combining global and local attention mechanisms is designed to extract discriminative features from complex backgrounds. Extensive experiments on three datasets demonstrate the effectiveness of the proposed methods. Code is available at https://github.com/ACAMPUS/DLGADet
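The abstract describes the HAM only as a combination of global and local attention. As a rough illustration of that general idea, and not the authors' module, the sketch below gates a feature map with one weight per channel (a global view) and one weight per spatial position (a local view). The function names, the parameter-free sigmoid gates, and the toy values are all assumptions made for this example; the actual design is in the linked repository.

```python
import math


def sigmoid(v):
    """Squash a real value into (0, 1) so it can act as a gate."""
    return 1.0 / (1.0 + math.exp(-v))


def global_gate(fmap):
    """Global branch (assumed form): one gate per channel from its spatial mean."""
    gates = []
    for channel in fmap:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        gates.append(sigmoid(total / count))
    return gates


def local_gate(fmap):
    """Local branch (assumed form): one gate per position from the mean across channels."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [sigmoid(sum(fmap[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def hierarchical_attention(fmap):
    """Scale each value of a C x H x W nested list by its channel gate and its position gate.

    Both gates are computed from the unmodified input and multiplied in together.
    """
    g = global_gate(fmap)
    loc = local_gate(fmap)
    return [
        [
            [fmap[k][i][j] * g[k] * loc[i][j] for j in range(len(fmap[k][i]))]
            for i in range(len(fmap[k]))
        ]
        for k in range(len(fmap))
    ]


# Toy 2-channel, 2x2 feature map (values are made up for illustration).
features = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 0.0], [0.0, 0.0]],
]
attended = hierarchical_attention(features)
```

A learned version would replace the fixed averages with trainable layers; the gates here only show how a per-channel and a per-position weighting can be applied to the same feature map.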
first_indexed 2024-03-09T02:56:22Z
format Article
id doaj.art-b689c7317a3c4012a7359a036a13d572
institution Directory Open Access Journal
issn 1751-9659
1751-9667
language English
last_indexed 2024-03-09T02:56:22Z
publishDate 2023-12-01
publisher Wiley
record_format Article
series IET Image Processing
spelling doaj.art-b689c7317a3c4012a7359a036a13d572
2023-12-05T06:22:51Z
eng
Wiley
IET Image Processing
1751-9659
1751-9667
2023-12-01
vol. 17, no. 14, pp. 3986-3999
10.1049/ipr2.12912
Small object detection based on hierarchical attention mechanism and multi‐scale separable detection
Yafeng Zhang (School of Software, Henan University, Kaifeng, Henan, China)
Junyang Yu (School of Software, Henan University, Kaifeng, Henan, China)
Yuanyuan Wang (Economic and Technical Research Institute, State Grid Henan Province Electric Power Company, Zhengzhou, Henan, China)
Shuang Tang (School of Economics, Henan University, Kaifeng, Henan, China)
Han Li (School of Economics, Henan University, Kaifeng, Henan, China)
Zhiyi Xin (School of Software, Henan University, Kaifeng, Henan, China)
Chaoyi Wang (Electrical and Computer Engineering, Shanghai Institute of Microsystem and Information Technology, Shanghai, China)
Ziming Zhao (School of Software, Henan University, Kaifeng, Henan, China)
Abstract: The ability of modern detectors to detect small targets remains an unresolved problem in object detection, in contrast to their capability on medium and large targets. Accurately detecting and identifying small objects in real‐world scenarios suffers from sub‐optimal performance due to factors such as small target size, complex backgrounds, variable illumination, occlusion, and target distortion. Here, a small object detection method for complex traffic scenarios named deformable local and global attention (DLGADet) is proposed. It combines hierarchical attention mechanisms (HAMs) with deformable multi‐scale feature fusion to improve recognition and detection performance. First, DLGADet combines multi‐scale separable detection with a multi‐scale feature fusion mechanism to obtain richer contextual information for feature fusion while addressing the misalignment between the classification and localisation tasks. Second, a deformation feature extraction module (DFEM) is designed to handle object deformation. Finally, a HAM combining global and local attention mechanisms is designed to extract discriminative features from complex backgrounds. Extensive experiments on three datasets demonstrate the effectiveness of the proposed methods. Code is available at https://github.com/ACAMPUS/DLGADet
https://doi.org/10.1049/ipr2.12912
image recognition
object detection
title Small object detection based on hierarchical attention mechanism and multi‐scale separable detection
topic image recognition
object detection
url https://doi.org/10.1049/ipr2.12912