ATS-YOLOv7: A Real-Time Multi-Scale Object Detection Method for UAV Aerial Images Based on Improved YOLOv7
The objects in UAV aerial images have multiple scales, dense distribution, and occlusion, posing considerable challenges for object detection. In order to address this problem, this paper proposes a real-time multi-scale object detection method based on an improved YOLOv7 model (ATS-YOLOv7) for UAV...
Main Authors: | Heng Zhang, Faming Shao, Xiaohui He, Weijun Chu, Dewei Zhao, Zihan Zhang, Shaohua Bi |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2023-12-01 |
Series: | Electronics |
Subjects: | UAV aerial images, object detection, YOLOv7, AF-FPN, transformer encoder, SIoU |
Online Access: | https://www.mdpi.com/2079-9292/12/23/4886 |
_version_ | 1797400268095094784 |
---|---|
author | Heng Zhang Faming Shao Xiaohui He Weijun Chu Dewei Zhao Zihan Zhang Shaohua Bi |
author_facet | Heng Zhang Faming Shao Xiaohui He Weijun Chu Dewei Zhao Zihan Zhang Shaohua Bi |
author_sort | Heng Zhang |
collection | DOAJ |
description | The objects in UAV aerial images have multiple scales, dense distribution, and occlusion, posing considerable challenges for object detection. To address this problem, this paper proposes a real-time multi-scale object detection method based on an improved YOLOv7 model (ATS-YOLOv7) for UAV aerial images. First, this paper introduces a feature pyramid network, AF-FPN, which is composed of an adaptive attention module (AAM) and a feature enhancement module (FEM). Through the AAM and FEM, AF-FPN reduces the loss of deep feature information caused by the reduction of feature channels in the convolution process, strengthens the feature perception ability, and improves the detection speed and accuracy for multi-scale objects. Second, we add a prediction head based on a transformer encoder block to the three-head structure of YOLOv7, improving the ability of the model to capture global information and feature expression, thus achieving efficient detection of objects with tiny scales and dense occlusion. Moreover, because CIoU (complete intersection over union), the location loss function of YOLOv7, cannot facilitate the regression of the prediction box angle to the ground truth box, which results in a slow convergence rate during model training, this paper proposes a loss function with angle regression, SIoU (soft intersection over union), in order to accelerate the convergence rate during model training. Finally, a series of comparative experiments are carried out on the DIOR dataset. The results indicate that ATS-YOLOv7 has the best detection accuracy (mAP of 87%) and meets the real-time requirements of image processing (detection speed of 94.2 FPS). |
first_indexed | 2024-03-09T01:53:06Z |
format | Article |
id | doaj.art-96e498c729074629a5f486d1a891ca1c |
institution | Directory of Open Access Journals |
issn | 2079-9292 |
language | English |
last_indexed | 2024-03-09T01:53:06Z |
publishDate | 2023-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Electronics |
spelling | doaj.art-96e498c729074629a5f486d1a891ca1c; 2023-12-08T15:14:21Z; eng; MDPI AG; Electronics; 2079-9292; 2023-12-01; 12; 23; 4886; 10.3390/electronics12234886; ATS-YOLOv7: A Real-Time Multi-Scale Object Detection Method for UAV Aerial Images Based on Improved YOLOv7; Heng Zhang (0), Faming Shao (1), Xiaohui He (2), Weijun Chu (3), Dewei Zhao (4), Zihan Zhang (5), Shaohua Bi (6); affiliation for all seven authors: College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China; https://www.mdpi.com/2079-9292/12/23/4886; UAV aerial images; object detection; YOLOv7; AF-FPN; transformer encoder; SIoU |
spellingShingle | Heng Zhang Faming Shao Xiaohui He Weijun Chu Dewei Zhao Zihan Zhang Shaohua Bi ATS-YOLOv7: A Real-Time Multi-Scale Object Detection Method for UAV Aerial Images Based on Improved YOLOv7 Electronics UAV aerial images object detection YOLOv7 AF-FPN transformer encoder SIoU |
title | ATS-YOLOv7: A Real-Time Multi-Scale Object Detection Method for UAV Aerial Images Based on Improved YOLOv7 |
title_full | ATS-YOLOv7: A Real-Time Multi-Scale Object Detection Method for UAV Aerial Images Based on Improved YOLOv7 |
title_fullStr | ATS-YOLOv7: A Real-Time Multi-Scale Object Detection Method for UAV Aerial Images Based on Improved YOLOv7 |
title_full_unstemmed | ATS-YOLOv7: A Real-Time Multi-Scale Object Detection Method for UAV Aerial Images Based on Improved YOLOv7 |
title_short | ATS-YOLOv7: A Real-Time Multi-Scale Object Detection Method for UAV Aerial Images Based on Improved YOLOv7 |
title_sort | ats yolov7 a real time multi scale object detection method for uav aerial images based on improved yolov7 |
topic | UAV aerial images object detection YOLOv7 AF-FPN transformer encoder SIoU |
url | https://www.mdpi.com/2079-9292/12/23/4886 |
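
The description field above names CIoU (complete intersection over union) as the location loss of YOLOv7 and says it has no term that steers the angle between the predicted and ground truth boxes. As a rough illustration of what CIoU does contain, the following minimal sketch computes the commonly published CIoU loss for two axis-aligned boxes: an overlap term, a center-distance term, and an aspect-ratio term. It is not the paper's SIoU loss and is not taken from the authors' code; the function names `iou` and `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for this example.

```python
import math


def iou(box_a, box_b):
    """Plain intersection over union for boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def ciou_loss(pred, gt, eps=1e-9):
    """CIoU loss = 1 - IoU + rho^2 / c^2 + alpha * v (illustrative sketch)."""
    overlap = iou(pred, gt)

    # Squared distance between the two box centers (rho^2).
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    gcx, gcy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (pcx - gcx) ** 2 + (pcy - gcy) ** 2

    # Squared diagonal of the smallest box enclosing both boxes (c^2).
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    pw, ph = pred[2] - pred[0], pred[3] - pred[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / ((1 - overlap) + v + eps)

    return 1 - overlap + rho2 / c2 + alpha * v


if __name__ == "__main__":
    print(ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

None of the three penalty terms depends on the direction of the line joining the two centers, only on its length, which is consistent with the record's remark that CIoU does not guide the angle of the regression. How the paper's SIoU adds that angle term is described in the article itself and is not reproduced here.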