YOLO-ViT-Based Method for Unmanned Aerial Vehicle Infrared Vehicle Target Detection
The detection of infrared vehicle targets by UAVs poses significant challenges in the presence of complex ground backgrounds, high target density, and a large proportion of small targets, which result in high false alarm rates. To alleviate these deficiencies, a novel YOLOv7-based, multi-scale targe...
Main Authors: | Xiaofeng Zhao, Yuting Xia, Wenwen Zhang, Chao Zheng, Zhili Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-07-01 |
Series: | Remote Sensing |
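Subjects: | unmanned aerial vehicle target detection; vehicle detection; infrared small target; deep learning; Yolov7 |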
Online Access: | https://www.mdpi.com/2072-4292/15/15/3778 |
_version_ | 1827730925647036416 |
---|---|
author | Xiaofeng Zhao Yuting Xia Wenwen Zhang Chao Zheng Zhili Zhang |
author_sort | Xiaofeng Zhao |
collection | DOAJ |
description | The detection of infrared vehicle targets by UAVs poses significant challenges in the presence of complex ground backgrounds, high target density, and a large proportion of small targets, which result in high false alarm rates. To address these problems, a novel YOLOv7-based, multi-scale target detection method for infrared vehicle targets, termed YOLO-ViT, is proposed. First, within the YOLOv7-based framework, the lightweight MobileViT network is incorporated as the feature extraction backbone to extract both local and global features of the object and to reduce model complexity. Second, a C3-PANet neural network structure is designed, which adopts the CARAFE upsampling method to exploit the semantic information in the feature map and improve the model’s recognition accuracy in the target region. Combined with the C3 structure, this enlarges the receptive field, improving the network’s accuracy on small targets and its generalization ability. Finally, the K-means++ clustering method is used to optimize the anchor box sizes, yielding anchor boxes better suited to detecting small infrared targets from UAVs and thereby improving detection efficiency. Experiments are reported on the public HIT-UAV dataset. The results show that, compared with the original method, YOLO-ViT reduces the number of parameters by 49.9% and floating-point operations by 67.9%. Furthermore, the mean average precision (mAP) improves by 0.9% over the existing algorithm, reaching 94.5%, which validates the effectiveness of the method for UAV infrared vehicle target detection. |
first_indexed | 2024-03-11T00:17:27Z |
format | Article |
id | doaj.art-62544d2ae1f348feb85ba214ffcd3814 |
institution | Directory Open Access Journal |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-11T00:17:27Z |
publishDate | 2023-07-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-62544d2ae1f348feb85ba214ffcd3814; 2023-11-18T23:30:41Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2023-07-01; vol. 15, no. 15, art. 3778; DOI 10.3390/rs15153778; YOLO-ViT-Based Method for Unmanned Aerial Vehicle Infrared Vehicle Target Detection; Xiaofeng Zhao (0), Yuting Xia (1), Wenwen Zhang (2), Chao Zheng (3), Zhili Zhang (4); affiliations 0 to 4: Xi’an Research Institute of High-Tech, Xi’an 710025, China; The detection of infrared vehicle targets by UAVs poses significant challenges in the presence of complex ground backgrounds, high target density, and a large proportion of small targets, which result in high false alarm rates. To address these problems, a novel YOLOv7-based, multi-scale target detection method for infrared vehicle targets, termed YOLO-ViT, is proposed. First, within the YOLOv7-based framework, the lightweight MobileViT network is incorporated as the feature extraction backbone to extract both local and global features of the object and to reduce model complexity. Second, a C3-PANet neural network structure is designed, which adopts the CARAFE upsampling method to exploit the semantic information in the feature map and improve the model’s recognition accuracy in the target region. Combined with the C3 structure, this enlarges the receptive field, improving the network’s accuracy on small targets and its generalization ability. Finally, the K-means++ clustering method is used to optimize the anchor box sizes, yielding anchor boxes better suited to detecting small infrared targets from UAVs and thereby improving detection efficiency. Experiments are reported on the public HIT-UAV dataset. The results show that, compared with the original method, YOLO-ViT reduces the number of parameters by 49.9% and floating-point operations by 67.9%. Furthermore, the mean average precision (mAP) improves by 0.9% over the existing algorithm, reaching 94.5%, which validates the effectiveness of the method for UAV infrared vehicle target detection.; https://www.mdpi.com/2072-4292/15/15/3778; unmanned aerial vehicle target detection; vehicle detection; infrared small target; deep learning; Yolov7 |
title | YOLO-ViT-Based Method for Unmanned Aerial Vehicle Infrared Vehicle Target Detection |
topic | unmanned aerial vehicle target detection vehicle detection infrared small target deep learning Yolov7 |
url | https://www.mdpi.com/2072-4292/15/15/3778 |
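
The description above states that K-means++ clustering is used to optimize anchor box sizes but gives no implementation details. The following is a minimal sketch of that general idea, not the authors' code. It assumes ground-truth boxes are available as (width, height) pairs, and it assumes a 1 - IoU distance with mean centroids, which is common YOLO practice but is not confirmed by this record. The function names `iou_wh` and `kmeanspp_anchors` and the toy box sizes are hypothetical.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeanspp_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) pairs into k anchors with K-means++ seeding.

    Assumption: distance is 1 - IoU and centroids are cluster means.
    """
    rng = random.Random(seed)

    # K-means++ seeding: the first centre is drawn uniformly; each later
    # centre is drawn with probability proportional to the squared distance
    # to its nearest already-chosen centre.
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centres) ** 2 for b in boxes]
        if sum(d2) == 0:
            # All boxes coincide with existing centres; fall back to uniform.
            centres.append(rng.choice(boxes))
        else:
            centres.append(rng.choices(boxes, weights=d2, k=1)[0])

    # Standard Lloyd iterations: assign each box to the centre with the
    # highest IoU, then recompute each centre as the mean of its members.
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            j = max(range(k), key=lambda i: iou_wh(b, centres[i]))
            clusters[j].append(b)
        new_centres = []
        for j, members in enumerate(clusters):
            if members:
                w = sum(m[0] for m in members) / len(members)
                h = sum(m[1] for m in members) / len(members)
                new_centres.append((w, h))
            else:
                new_centres.append(centres[j])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres

    # Return anchors ordered from smallest to largest area.
    return sorted(centres, key=lambda a: a[0] * a[1])


# Hypothetical toy data: box sizes in pixels, not taken from HIT-UAV.
boxes = [(10, 8), (12, 9), (11, 10), (30, 22), (28, 25),
         (33, 24), (60, 45), (64, 50), (58, 48)]
anchors = kmeanspp_anchors(boxes, k=3)
print(anchors)
```

In a real pipeline the `boxes` list would be collected from the training annotations, and `k` would match the number of anchors the detection head expects.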