YOLO-UAV: Object Detection Method of Unmanned Aerial Vehicle Imagery Based on Efficient Multi-Scale Feature Fusion

Bibliographic Details
Main Authors: Chengji Ma, Yanyun Fu, Deyong Wang, Rui Guo, Xueyi Zhao, Jian Fang
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10305077/
Description
Summary: As Unmanned Aerial Vehicle (UAV) remote sensing technology progresses, the use of deep learning in UAV imagery object detection has become more prevalent. However, detecting small targets in complex backgrounds and distinguishing dense targets remain major challenges. To address these issues and improve object detection efficiency, this study proposes a UAV imagery object detection method called YOLO-UAV, built by optimizing YOLOv5. YOLO-UAV first reconstructs the backbone and feature fusion networks, simplifying the network structure and reducing the computational burden. A Dense_CSPDarknet53 backbone, formed by adding dense connections, extracts latent image information through feature reuse. In the Neck, an efficient feature fusion block that combines structural re-parameterization and ELAN strategies reduces interference from complex background noise while extracting more accurate and richer features. In addition, the proposed GS-Decoupled Head reduces the parameter count of the decoupled head without compromising accuracy, and separates classification from regression to lessen the influence of task disparities on prediction bias. To tackle the discrepancy between positive and negative samples in bounding box regression, the study introduces a new loss function, Focal-ECIoU, which speeds up network convergence and improves the model's localization ability. Experiments on the public VisDrone2019 dataset indicate that YOLO-UAV outperforms other advanced object detection methods in overall performance. Compared with the baseline model YOLOv5s, YOLO-UAV increases mAP0.5 from 35.1% to 46.7% and mAP0.5:0.95 from 19.1% to 27.4%. For small-scale targets, AP_small increases from 10.2% to 17.3%. The experiments show that YOLO-UAV improves object detection accuracy and has strong generalization ability, satisfying the practical requirements of UAV imagery object detection tasks.
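The reported scores are easier to compare as gains over the YOLOv5s baseline. The sketch below only re-derives arithmetic from the figures quoted in the summary; the dictionary keys and the `gains` helper are illustrative names, not part of the paper or of any YOLO toolkit.

```python
# Reported VisDrone2019 scores in percent, as quoted in the summary:
# (YOLOv5s baseline, YOLO-UAV). Key names are illustrative labels only.
reported = {
    "mAP0.5": (35.1, 46.7),
    "mAP0.5:0.95": (19.1, 27.4),
    "AP_small": (10.2, 17.3),
}


def gains(baseline, improved):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = improved - baseline
    relative = 100.0 * absolute / baseline
    return round(absolute, 1), round(relative, 1)


# Hypothetical helper output: one (absolute, relative) pair per metric.
summary = {name: gains(*scores) for name, scores in reported.items()}

for name, (abs_gain, rel_gain) in summary.items():
    print(f"{name}: +{abs_gain} points ({rel_gain}% relative)")
```

Read this way, the small-target metric shows the largest relative gain of the three, which is consistent with the summary's emphasis on small-target detection.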
ISSN:2169-3536