YOLO-ViT-Based Method for Unmanned Aerial Vehicle Infrared Vehicle Target Detection

Bibliographic Details
Main Authors: Xiaofeng Zhao, Yuting Xia, Wenwen Zhang, Chao Zheng, Zhili Zhang
Format: Article
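Language: English
Published: MDPI AG 2023-07-01
Series: Remote Sensing
Subjects: unmanned aerial vehicle target detection; vehicle detection; infrared small target; deep learning; Yolov7
Online Access: https://www.mdpi.com/2072-4292/15/15/3778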
author Xiaofeng Zhao
Yuting Xia
Wenwen Zhang
Chao Zheng
Zhili Zhang
collection DOAJ
description Detecting infrared vehicle targets from UAVs is challenging: complex ground backgrounds, high target density, and a large proportion of small targets lead to high false alarm rates. To address these problems, a YOLOv7-based multi-scale detection method for infrared vehicle targets, termed YOLO-ViT, is proposed. First, within the YOLOv7 framework, the lightweight MobileViT network is adopted as the feature extraction backbone to capture both local and global features of the object while reducing model complexity. Second, a C3-PANet structure is designed that uses the CARAFE upsampling method to exploit the semantic information in the feature map and improve recognition accuracy in the target region; combined with the C3 structure, it enlarges the receptive field, improving both small-target accuracy and the model’s generalization ability. Finally, the K-means++ clustering method is used to optimize the anchor box sizes, yielding anchor boxes better suited to small infrared targets seen from UAVs and thereby improving detection efficiency. Experiments on the public HIT-UAV dataset show that, compared with the original method, YOLO-ViT reduces the number of parameters by 49.9% and floating-point operations by 67.9%. The mean average precision (mAP) improves by 0.9% over the existing algorithm, reaching 94.5%, which validates the effectiveness of the method for UAV infrared vehicle target detection.
first_indexed 2024-03-11T00:17:27Z
format Article
id doaj.art-62544d2ae1f348feb85ba214ffcd3814
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-11T00:17:27Z
publishDate 2023-07-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-62544d2ae1f348feb85ba214ffcd3814
2023-11-18T23:30:41Z
eng
MDPI AG
Remote Sensing
2072-4292
2023-07-01
Volume 15, Issue 15, Article 3778
10.3390/rs15153778
YOLO-ViT-Based Method for Unmanned Aerial Vehicle Infrared Vehicle Target Detection
Xiaofeng Zhao, Yuting Xia, Wenwen Zhang, Chao Zheng, Zhili Zhang (all: Xi’an Research Institute of High-Tech, Xi’an 710025, China)
https://www.mdpi.com/2072-4292/15/15/3778
unmanned aerial vehicle target detection; vehicle detection; infrared small target; deep learning; Yolov7
title YOLO-ViT-Based Method for Unmanned Aerial Vehicle Infrared Vehicle Target Detection
topic unmanned aerial vehicle target detection
vehicle detection
infrared small target
deep learning
Yolov7
url https://www.mdpi.com/2072-4292/15/15/3778
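
The description states that K-means++ clustering is used to optimize anchor box sizes but gives no further detail. The following is a minimal sketch of how such clustering is commonly done, not the authors' implementation. It assumes that boxes are represented as (width, height) pairs and that the distance between a box and an anchor is 1 - IoU, a convention often used for YOLO anchors that this record does not confirm. The function names `iou_wh` and `kmeanspp_anchors` and the sample box sizes are hypothetical.

```python
import random


def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), aligned at a common corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeanspp_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) pairs into k anchors (assumed distance: 1 - IoU)."""
    rng = random.Random(seed)
    # K-means++ seeding: the first centre is drawn uniformly; each later
    # centre is drawn with probability proportional to the squared
    # distance from a box to its nearest already-chosen centre.
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centres) ** 2 for b in boxes]
        if sum(d2) == 0:
            # Every box coincides with a centre; fall back to a uniform draw.
            centres.append(rng.choice(boxes))
        else:
            centres.append(rng.choices(boxes, weights=d2, k=1)[0])
    for _ in range(iters):
        # Assignment step: each box joins the centre with the highest IoU.
        groups = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centres[i]))
            groups[best].append(b)
        # Update step: mean width and height per group; an empty group
        # keeps its previous centre.
        new = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g))
            if g else c
            for g, c in zip(groups, centres)
        ]
        if new == centres:
            break
        centres = new
    # Return anchors ordered from smallest to largest area.
    return sorted(centres, key=lambda c: c[0] * c[1])


# Invented box sizes in pixels, for illustration only (not HIT-UAV data).
sample_boxes = [(8, 9), (10, 11), (9, 8), (28, 19), (31, 22), (30, 20),
                (78, 61), (82, 58), (80, 60)]
print(kmeanspp_anchors(sample_boxes, k=3))
```

In practice the (width, height) pairs would come from the training set's ground-truth labels, and the resulting anchors would replace the detector's default anchor sizes.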