ESF-YOLO: an accurate and universal object detector based on neural networks

As an excellent single-stage object detector based on neural networks, YOLOv5 has found extensive applications in the industrial domain; however, it still exhibits certain design limitations. To address these issues, this paper proposes Efficient Scale Fusion YOLO (ESF-YOLO). Firstly, the Multi-Samp...

Bibliographic Details
Main Authors: Wenguang Tao, Xiaotian Wang, Tian Yan, Zhengzhuo Liu, Shizheng Wan
Format: Article
Language: English
Published: Frontiers Media S.A. 2024-04-01
Series: Frontiers in Neuroscience
Online Access: https://www.frontiersin.org/articles/10.3389/fnins.2024.1371418/full
collection DOAJ
description YOLOv5, a single-stage object detector based on neural networks, is widely used in industry, but it still has design limitations. To address them, this paper proposes Efficient Scale Fusion YOLO (ESF-YOLO). First, a Multi-Sampling Conv Module (MSCM) is designed, which enhances the backbone network’s learning capability for low-level features through multi-scale receptive fields and cross-scale feature fusion. Second, to tackle occlusion, a new Block-wise Channel Attention Module (BCAM) is designed, which assigns greater weights to channels that carry critical information. Third, a lightweight Decoupled Head (LD-Head) is devised. In addition, the loss function is redesigned to address the asynchrony between labels and confidences, alleviating the imbalance between positive and negative samples during training. Finally, an adaptive scale factor for Intersection over Union (IoU) calculation is proposed, which adjusts bounding box sizes to accommodate targets of different sizes in the dataset. Experimental results on the SODA10M and CBIA8K datasets show that ESF-YOLO increases Average Precision at 0.50 IoU (AP50) by 3.93% and 2.24%, Average Precision at 0.75 IoU (AP75) by 4.77% and 4.85%, and mean Average Precision (mAP) by 4% and 5.39%, respectively, supporting the model’s broad applicability.
first_indexed 2024-04-24T12:32:12Z
id doaj.art-bba4d4bfd4d8401094d6d5c949aad24f
institution Directory Open Access Journal
issn 1662-453X
last_indexed 2024-04-24T12:32:12Z
spelling doaj.art-bba4d4bfd4d8401094d6d5c949aad24f; 2024-04-08T04:28:43Z; eng; Frontiers Media S.A.; Frontiers in Neuroscience; 1662-453X; 2024-04-01; 18; 10.3389/fnins.2024.1371418; 1371418
Wenguang Tao: Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
Xiaotian Wang: Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
Tian Yan: Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
Zhengzhuo Liu: Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China
Shizheng Wan: Shanghai Electro-Mechanical Engineering Institute, Shanghai, China
topic neural network
object detection
cross-scale feature fusion
attention mechanism
lightweight decoupled head
dynamic loss function
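
The description mentions a scale factor applied to bounding boxes in the Intersection over Union (IoU) calculation, but this record does not give the formula or how the factor is adapted. The Python sketch below only illustrates the general idea under stated assumptions: boxes are (x1, y1, x2, y2) tuples, and a fixed, hand-chosen `scale` resizes each box about its center before IoU is computed. The names `scale_box`, `iou`, and `scaled_iou` are hypothetical and are not taken from the paper.

```python
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def scale_box(box: Box, scale: float) -> Box:
    """Resize a box about its center by `scale` (hypothetical helper)."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def iou(a: Box, b: Box) -> float:
    """Standard IoU of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def scaled_iou(a: Box, b: Box, scale: float) -> float:
    """IoU after resizing both boxes by the same factor.

    Assumption: `scale` is a fixed constant here; the paper's adaptive
    rule for choosing it is not described in this record.
    """
    return iou(scale_box(a, scale), scale_box(b, scale))


if __name__ == "__main__":
    pred, target = (0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0)
    print(iou(pred, target))              # disjoint boxes
    print(scaled_iou(pred, target, 3.0))  # enlarged boxes now overlap
```

With a scale above 1, small boxes that do not touch can still produce a nonzero overlap, which is one plausible reason to vary the factor with target size; whether ESF-YOLO uses it this way should be checked against the full article.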