DETR-crowd is all you need

Crowded pedestrian detection is an active topic within pedestrian detection. To reduce missed targets and improve the detection of small pedestrians in crowded scenes, an improved DETR object detection algorithm called DETR-crowd is proposed. The attention model DETR is used as the baseline m...

Bibliographic Details
Main Authors: Liu Weijia, Zishen Zheng, Ke Fan, Kun He, Taiqiu Huang, Weijia Liu, Xianlun Ke, Yuming Xu
Format: Article
Language: English
Published: Siberian Scientific Centre DNIT 2023-05-01
Series: Современные инновации, системы и технологии (Modern Innovations, Systems and Technologies)
Online Access: https://oajmist.com/index.php/12/article/view/209
_version_ 1797816841617276928
author Liu Weijia
Zishen Zheng
Ke Fan
Kun He
Taiqiu Huang
Weijia Liu
Xianlun Ke
Yuming Xu
author_sort Liu Weijia
collection DOAJ
description Crowded pedestrian detection is an active topic within pedestrian detection. To reduce missed targets and improve the detection of small pedestrians in crowded scenes, an improved DETR object detection algorithm called DETR-crowd is proposed. The attention-based DETR model serves as the baseline, so that objects can still be detected when part of their features is missing, as happens in crowded pedestrian scenes. A deformable attention encoder is introduced to exploit multi-scale feature maps, which carry a large amount of small-target information, and thereby improve detection accuracy for small pedestrians. To extract and refine important features more efficiently, an improved EfficientNet backbone fused with a channel-spatial attention module is used for feature extraction. Because models that use attention-based detection modules train inefficiently, Smooth-L1 and GIoU are combined as the loss function during training, allowing the model to converge to higher precision. Experimental results on the WiderPerson crowded pedestrian detection dataset show that the proposed algorithm exceeds YOLOX by 0.039 in AP50 and YOLOv5 by 0.015 in AP50. The proposed algorithm can be applied effectively to crowded pedestrian detection tasks.
first_indexed 2024-03-13T08:44:22Z
format Article
id doaj.art-afac1c8ccb03469fb5b55e340589736a
institution Directory Open Access Journal
issn 2782-2826
2782-2818
language English
last_indexed 2024-03-13T08:44:22Z
publishDate 2023-05-01
publisher Siberian Scientific Centre DNIT
record_format Article
series Современные инновации, системы и технологии (Modern Innovations, Systems and Technologies)
spelling doaj.art-afac1c8ccb03469fb5b55e340589736a
2023-05-30T07:35:55Z
eng
Siberian Scientific Centre DNIT
Современные инновации, системы и технологии
2782-2826
2782-2818
2023-05-01
Volume 3, Issue 2
10.47813/2782-2818-2023-3-2-0213-0224
DETR-crowd is all you need
Liu Weijia 0, Zishen Zheng 1, Ke Fan 2, Kun He 3, Taiqiu Huang 4, Weijia Liu 5, Xianlun Ke 6, Yuming Xu 7
Trine University, Phoenix, USA
Taiyuan University of Technology, Taiyuan, China
Arizona State University, Phoenix, USA
Illinois Institute of Technology, Chicago, USA
Shenzhen University, Shenzhen, China
Trine University, Phoenix, United States
Yunnan University, Kunming, China
Shenzhen University, Shenzhen, China
https://oajmist.com/index.php/12/article/view/209
title DETR-crowd is all you need
url https://oajmist.com/index.php/12/article/view/209
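
The description above says that Smooth-L1 and GIoU are combined as the training loss, but this record gives no formula or weights. The following is a minimal sketch of one common way to combine them for a single predicted box and its matched ground-truth box. The function names, the corner box format (x1, y1, x2, y2), the `beta` threshold, and the weights `w_l1` and `w_giou` are all assumptions made for illustration, not values taken from the article.

```python
from typing import Sequence

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Sequence[float]


def smooth_l1(pred: Box, target: Box, beta: float = 1.0) -> float:
    """Mean Smooth-L1 distance over the four box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        # Quadratic near zero, linear for large errors.
        total += 0.5 * diff * diff / beta if diff < beta else diff - 0.5 * beta
    return total / len(pred)


def giou(pred: Box, target: Box) -> float:
    """Generalized IoU of two boxes; the value lies in (-1, 1]."""
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    inter_h = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = inter_w * inter_h
    union = area_p + area_t - inter
    # Smallest axis-aligned box enclosing both inputs.
    enc_w = max(pred[2], target[2]) - min(pred[0], target[0])
    enc_h = max(pred[3], target[3]) - min(pred[1], target[1])
    enclose = enc_w * enc_h
    return inter / union - (enclose - union) / enclose


def box_loss(pred: Box, target: Box, w_l1: float = 1.0, w_giou: float = 1.0) -> float:
    """Weighted sum of Smooth-L1 and (1 - GIoU); the weights are hypothetical."""
    return w_l1 * smooth_l1(pred, target) + w_giou * (1.0 - giou(pred, target))
```

The GIoU term still gives a useful signal when the boxes do not overlap at all, where plain IoU would be zero, while the Smooth-L1 term penalizes coordinate error directly. In a real training setup the two weights would need tuning.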