ATS-YOLOv7: A Real-Time Multi-Scale Object Detection Method for UAV Aerial Images Based on Improved YOLOv7

The objects in UAV aerial images have multiple scales, dense distribution, and occlusion, posing considerable challenges for object detection. In order to address this problem, this paper proposes a real-time multi-scale object detection method based on an improved YOLOv7 model (ATS-YOLOv7) for UAV...

Bibliographic Details
Main Authors: Heng Zhang, Faming Shao, Xiaohui He, Weijun Chu, Dewei Zhao, Zihan Zhang, Shaohua Bi
Format: Article
Language: English
Published: MDPI AG 2023-12-01
Series: Electronics
Subjects: UAV aerial images; object detection; YOLOv7; AF-FPN; transformer encoder; SIoU
Online Access: https://www.mdpi.com/2079-9292/12/23/4886
author Heng Zhang
Faming Shao
Xiaohui He
Weijun Chu
Dewei Zhao
Zihan Zhang
Shaohua Bi
collection DOAJ
description Objects in UAV aerial images vary widely in scale, are densely distributed, and are often occluded, which poses considerable challenges for object detection. To address this problem, this paper proposes ATS-YOLOv7, a real-time multi-scale object detection method for UAV aerial images based on an improved YOLOv7 model. First, the paper introduces a feature pyramid network, AF-FPN, composed of an adaptive attention module (AAM) and a feature enhancement module (FEM). Through the AAM and FEM, AF-FPN reduces the loss of deep feature information caused by the reduction of feature channels during convolution, strengthens feature perception, and improves detection speed and accuracy for multi-scale objects. Second, a prediction head based on a transformer encoder block is added to the three-head structure of YOLOv7, improving the model's ability to capture global information and its feature expression, and thereby enabling efficient detection of tiny and densely occluded objects. Moreover, because CIoU (complete intersection over union), the location loss function of YOLOv7, cannot facilitate the regression of the prediction box angle to the ground truth box, which results in slow convergence during model training, this paper proposes a loss function with angle regression, SIoU (soft intersection over union), to accelerate convergence. Finally, a series of comparative experiments is carried out on the DIOR dataset. The results indicate that ATS-YOLOv7 achieves the best detection accuracy among the compared methods (mAP of 87%) and meets the real-time requirements of image processing (detection speed of 94.2 FPS).
first_indexed 2024-03-09T01:53:06Z
format Article
id doaj.art-96e498c729074629a5f486d1a891ca1c
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-03-09T01:53:06Z
publishDate 2023-12-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-96e498c729074629a5f486d1a891ca1c; 2023-12-08T15:14:21Z; eng; MDPI AG; Electronics; ISSN 2079-9292; 2023-12-01; vol. 12, no. 23, art. 4886; DOI 10.3390/electronics12234886
ATS-YOLOv7: A Real-Time Multi-Scale Object Detection Method for UAV Aerial Images Based on Improved YOLOv7
Authors: Heng Zhang; Faming Shao; Xiaohui He; Weijun Chu; Dewei Zhao; Zihan Zhang; Shaohua Bi
Affiliation (all authors): College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China
URL: https://www.mdpi.com/2079-9292/12/23/4886
Keywords: UAV aerial images; object detection; YOLOv7; AF-FPN; transformer encoder; SIoU
title ATS-YOLOv7: A Real-Time Multi-Scale Object Detection Method for UAV Aerial Images Based on Improved YOLOv7
topic UAV aerial images
object detection
YOLOv7
AF-FPN
transformer encoder
SIoU
url https://www.mdpi.com/2079-9292/12/23/4886
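
Note on the CIoU loss named in the description: the abstract states that CIoU, the location loss of YOLOv7, does not help regress the angle between the prediction box and the ground truth box. As a rough illustration only, the sketch below computes CIoU in its commonly published form for two axis-aligned boxes. It is not taken from the paper or from this record. The function names `ciou` and `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` guard are assumptions made for this example.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """CIoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive width and height.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Plain IoU: overlap area divided by union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    iou = inter / union

    # Squared distance between the two box centres.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(bw / bh) - math.atan(aw / ah)) ** 2
    alpha = v / (1 - iou + v + eps)  # eps avoids 0/0 for identical boxes

    return iou - rho2 / c2 - alpha * v


def ciou_loss(box_a, box_b):
    """Loss form: 0 for a perfect match, larger for worse matches."""
    return 1.0 - ciou(box_a, box_b)


if __name__ == "__main__":
    ground_truth = (0.0, 0.0, 2.0, 2.0)
    prediction = (1.0, 0.0, 3.0, 2.0)
    print(ciou(ground_truth, prediction), ciou_loss(ground_truth, prediction))
```

The three terms cover overlap, centre distance, and aspect ratio. None of them contains an explicit angle between the box centres, which appears to be the gap that the angle-aware loss in the description is meant to fill.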