DETR-crowd is all you need

"Crowded pedestrian detection" is a hot topic in the field of pedestrian detection. To address the issue of missed targets and small pedestrians in crowded scenes, an improved DETR object detection algorithm called DETR-crowd is proposed. The attention model DETR is used as the baseline m...


Bibliographic Details
Main Authors:	Liu Weijia, Zishen Zheng, Ke Fan, Kun He, Taiqiu Huang, Weijia Liu, Xianlun Ke, Yuming Xu
Format:	Article
Language:	English
Published:	Siberian Scientific Centre DNIT 2023-05-01
Series:	Современные инновации, системы и технологии
Online Access:	https://oajmist.com/index.php/12/article/view/209

collection	DOAJ
description	"Crowded pedestrian detection" is a hot topic in the field of pedestrian detection. To address the issue of missed targets and small pedestrians in crowded scenes, an improved DETR object detection algorithm called DETR-crowd is proposed. The attention-based model DETR serves as the baseline, so that objects can still be detected when part of their features is missing in crowded pedestrian scenes. A deformable attention encoder is introduced to make effective use of multi-scale feature maps, which carry a large amount of small-target information, and thereby improve detection accuracy for small pedestrians. To extract and refine important features more efficiently, an improved EfficientNet backbone fused with a channel-spatial attention module is used for feature extraction. To address the low training efficiency of models that use attention-based detection modules, Smooth-L1 and GIoU are combined as the loss function during training, allowing the model to converge to higher precision. Experimental results on the WiderPerson crowded pedestrian detection dataset show that the proposed algorithm exceeds YOLOX by 0.039 and YOLOv5 by 0.015 in AP50. The proposed algorithm can be effectively applied to crowded pedestrian detection tasks.
first_indexed	2024-03-13T08:44:22Z
format	Article
id	doaj.art-afac1c8ccb03469fb5b55e340589736a
institution	Directory Open Access Journal
issn	2782-2826 2782-2818
language	English
last_indexed	2024-03-13T08:44:22Z
publishDate	2023-05-01
publisher	Siberian Scientific Centre DNIT
record_format	Article
series	Современные инновации, системы и технологии
spelling	doaj.art-afac1c8ccb03469fb5b55e340589736a | 2023-05-30T07:35:55Z | eng | Siberian Scientific Centre DNIT | Современные инновации, системы и технологии | 2782-2826 | 2782-2818 | 2023-05-01 | 3 | 2 | 10.47813/2782-2818-2023-3-2-0213-0224 | DETR-crowd is all you need | Liu Weijia (0), Zishen Zheng (1), Ke Fan (2), Kun He (3), Taiqiu Huang (4), Weijia Liu (5), Xianlun Ke (6), Yuming Xu (7) | Trine University, Phoenix, USA; Taiyuan University of Technology, Taiyuan, China; Arizona State University, Phoenix, USA; Illinois Institute of Technology, Chicago, USA; Shenzhen University, Shenzhen, China; Trine University, Phoenix, United States; Yunnan University, Kunming, China; Shenzhen University, Shenzhen, China | https://oajmist.com/index.php/12/article/view/209
title	DETR-crowd is all you need
url	https://oajmist.com/index.php/12/article/view/209
work_keys_str_mv	AT liuweijia detrcrowdisallyouneed AT zishenzheng detrcrowdisallyouneed AT kefan detrcrowdisallyouneed AT kunhe detrcrowdisallyouneed AT taiqiuhuang detrcrowdisallyouneed AT weijialiu detrcrowdisallyouneed AT xianlunke detrcrowdisallyouneed AT yumingxu detrcrowdisallyouneed
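
The description in this record says that Smooth-L1 and GIoU are combined as the training loss, but it gives no formula or weights. The following is a minimal sketch of one common way to combine the two terms for a single pair of boxes, not the article's implementation. It assumes non-degenerate axis-aligned boxes in (x1, y1, x2, y2) form; the function names and the weights `w_l1` and `w_giou` are hypothetical and not taken from the article.

```python
from typing import Sequence

Box = Sequence[float]  # (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed)


def smooth_l1(pred: Box, target: Box, beta: float = 1.0) -> float:
    """Mean Smooth-L1 loss over the four box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        # Quadratic for small errors, linear for large ones.
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total / len(pred)


def area(box: Box) -> float:
    """Area of a box; empty or inverted boxes count as zero."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def giou(pred: Box, target: Box) -> float:
    """Generalized IoU in [-1, 1]: IoU minus the unused share of the hull."""
    inter = area((max(pred[0], target[0]), max(pred[1], target[1]),
                  min(pred[2], target[2]), min(pred[3], target[3])))
    union = area(pred) + area(target) - inter
    # Smallest axis-aligned box enclosing both inputs.
    hull = area((min(pred[0], target[0]), min(pred[1], target[1]),
                 max(pred[2], target[2]), max(pred[3], target[3])))
    return inter / union - (hull - union) / hull


def combined_box_loss(pred: Box, target: Box,
                      w_l1: float = 1.0, w_giou: float = 1.0) -> float:
    """Weighted sum of Smooth-L1 and the GIoU loss (1 - GIoU).

    The default weights are placeholders, not values from the article.
    """
    return w_l1 * smooth_l1(pred, target) + w_giou * (1.0 - giou(pred, target))


if __name__ == "__main__":
    print(combined_box_loss((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0)))
```

The GIoU term still changes as two non-overlapping boxes move closer or farther apart, where plain IoU stays at zero, while the Smooth-L1 term penalizes coordinate error directly.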

Similar Items