ESF-YOLO: an accurate and universal object detector based on neural networks

As an excellent single-stage object detector based on neural networks, YOLOv5 has found extensive applications in the industrial domain; however, it still exhibits certain design limitations. To address these issues, this paper proposes Efficient Scale Fusion YOLO (ESF-YOLO). Firstly, the Multi-Samp...


Bibliographic Details
Main Authors:	Wenguang Tao, Xiaotian Wang, Tian Yan, Zhengzhuo Liu, Shizheng Wan
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2024-04-01
Series:	Frontiers in Neuroscience
Subjects:	neural network; object detection; cross-scale feature fusion; attention mechanism; lightweight decoupled head; dynamic loss function
Online Access:	https://www.frontiersin.org/articles/10.3389/fnins.2024.1371418/full

_version_	1797219347806027776
author	Wenguang Tao Xiaotian Wang Tian Yan Zhengzhuo Liu Shizheng Wan
author_facet	Wenguang Tao Xiaotian Wang Tian Yan Zhengzhuo Liu Shizheng Wan
author_sort	Wenguang Tao
collection	DOAJ
description	As an excellent single-stage object detector based on neural networks, YOLOv5 has found extensive applications in the industrial domain; however, it still exhibits certain design limitations. To address these issues, this paper proposes Efficient Scale Fusion YOLO (ESF-YOLO). Firstly, the Multi-Sampling Conv Module (MSCM) is designed, which enhances the backbone network’s learning capability for low-level features through multi-scale receptive fields and cross-scale feature fusion. Secondly, to tackle occlusion issues, a new Block-wise Channel Attention Module (BCAM) is designed, assigning greater weights to channels corresponding to critical information. Next, a lightweight Decoupled Head (LD-Head) is devised. Additionally, the loss function is redesigned to address asynchrony between labels and confidences, alleviating the imbalance between positive and negative samples during neural network training. Finally, an adaptive scale factor for Intersection over Union (IoU) calculation is proposed, which adjusts bounding box sizes adaptively to accommodate targets of different sizes in the dataset. Experimental results on the SODA10M and CBIA8K datasets demonstrate that ESF-YOLO increases Average Precision at 0.50 IoU (AP50) by 3.93% and 2.24%, Average Precision at 0.75 IoU (AP75) by 4.77% and 4.85%, and mean Average Precision (mAP) by 4% and 5.39%, respectively, validating the model’s broad applicability.
first_indexed	2024-04-24T12:32:12Z
format	Article
id	doaj.art-bba4d4bfd4d8401094d6d5c949aad24f
institution	Directory Open Access Journal
issn	1662-453X
language	English
last_indexed	2024-04-24T12:32:12Z
publishDate	2024-04-01
publisher	Frontiers Media S.A.
record_format	Article
series	Frontiers in Neuroscience
spelling	doaj.art-bba4d4bfd4d8401094d6d5c949aad24f | 2024-04-08T04:28:43Z | eng | Frontiers Media S.A. | Frontiers in Neuroscience | 1662-453X | 2024-04-01 | 18 | 10.3389/fnins.2024.1371418 | 1371418 | ESF-YOLO: an accurate and universal object detector based on neural networks | Wenguang Tao [0]; Xiaotian Wang [1]; Tian Yan [2]; Zhengzhuo Liu [3]; Shizheng Wan [4] | [0] Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China; [1] Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China; [2] Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China; [3] Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, China; [4] Shanghai Electro-Mechanical Engineering Institute, Shanghai, China | As an excellent single-stage object detector based on neural networks, YOLOv5 has found extensive applications in the industrial domain; however, it still exhibits certain design limitations. To address these issues, this paper proposes Efficient Scale Fusion YOLO (ESF-YOLO). Firstly, the Multi-Sampling Conv Module (MSCM) is designed, which enhances the backbone network’s learning capability for low-level features through multi-scale receptive fields and cross-scale feature fusion. Secondly, to tackle occlusion issues, a new Block-wise Channel Attention Module (BCAM) is designed, assigning greater weights to channels corresponding to critical information. Next, a lightweight Decoupled Head (LD-Head) is devised. Additionally, the loss function is redesigned to address asynchrony between labels and confidences, alleviating the imbalance between positive and negative samples during neural network training. Finally, an adaptive scale factor for Intersection over Union (IoU) calculation is proposed, which adjusts bounding box sizes adaptively to accommodate targets of different sizes in the dataset. Experimental results on the SODA10M and CBIA8K datasets demonstrate that ESF-YOLO increases Average Precision at 0.50 IoU (AP50) by 3.93% and 2.24%, Average Precision at 0.75 IoU (AP75) by 4.77% and 4.85%, and mean Average Precision (mAP) by 4% and 5.39%, respectively, validating the model’s broad applicability. | https://www.frontiersin.org/articles/10.3389/fnins.2024.1371418/full | neural network; object detection; cross-scale feature fusion; attention mechanism; lightweight decoupled head; dynamic loss function
spellingShingle	Wenguang Tao Xiaotian Wang Tian Yan Zhengzhuo Liu Shizheng Wan ESF-YOLO: an accurate and universal object detector based on neural networks Frontiers in Neuroscience neural network object detection cross-scale feature fusion attention mechanism lightweight decoupled head dynamic loss function
title	ESF-YOLO: an accurate and universal object detector based on neural networks
title_full	ESF-YOLO: an accurate and universal object detector based on neural networks
title_fullStr	ESF-YOLO: an accurate and universal object detector based on neural networks
title_full_unstemmed	ESF-YOLO: an accurate and universal object detector based on neural networks
title_short	ESF-YOLO: an accurate and universal object detector based on neural networks
title_sort	esf yolo an accurate and universal object detector based on neural networks
topic	neural network; object detection; cross-scale feature fusion; attention mechanism; lightweight decoupled head; dynamic loss function
url	https://www.frontiersin.org/articles/10.3389/fnins.2024.1371418/full
work_keys_str_mv	AT wenguangtao esfyoloanaccurateanduniversalobjectdetectorbasedonneuralnetworks AT xiaotianwang esfyoloanaccurateanduniversalobjectdetectorbasedonneuralnetworks AT tianyan esfyoloanaccurateanduniversalobjectdetectorbasedonneuralnetworks AT zhengzhuoliu esfyoloanaccurateanduniversalobjectdetectorbasedonneuralnetworks AT shizhengwan esfyoloanaccurateanduniversalobjectdetectorbasedonneuralnetworks

