DETR-crowd is all you need
"Crowded pedestrian detection" is a hot topic in the field of pedestrian detection. To address the issue of missed targets and small pedestrians in crowded scenes, an improved DETR object detection algorithm called DETR-crowd is proposed. The attention model DETR is used as the baseline m...
Main Authors: | Liu Weijia, Zishen Zheng, Ke Fan, Kun He, Taiqiu Huang, Weijia Liu, Xianlun Ke, Yuming Xu |
---|---|
Format: | Article |
Language: | English |
Published: | Siberian Scientific Centre DNIT, 2023-05-01 |
Series: | Современные инновации, системы и технологии |
Online Access: | https://oajmist.com/index.php/12/article/view/209 |
_version_ | 1797816841617276928 |
---|---|
author | Liu Weijia, Zishen Zheng, Ke Fan, Kun He, Taiqiu Huang, Weijia Liu, Xianlun Ke, Yuming Xu |
author_facet | Liu Weijia, Zishen Zheng, Ke Fan, Kun He, Taiqiu Huang, Weijia Liu, Xianlun Ke, Yuming Xu |
author_sort | Liu Weijia |
collection | DOAJ |
description | Crowded pedestrian detection is an active topic in pedestrian detection. To reduce missed targets and improve the detection of small pedestrians in crowded scenes, an improved DETR object detection algorithm called DETR-crowd is proposed. The attention-based model DETR is used as the baseline to perform object detection when partial features are missing in crowded pedestrian scenes. A deformable attention encoder is introduced to exploit multi-scale feature maps, which carry a large amount of small-target information, and thereby improve the detection accuracy for small pedestrians. To make the extraction and refinement of important features more efficient, an improved EfficientNet backbone network fused with a channel-spatial attention module is used for feature extraction. To address the low training efficiency of models that use attention-based detection modules, Smooth-L1 and GIoU are combined as the loss function during training, allowing the model to converge to higher precision. Experimental results on the WiderPerson crowded pedestrian detection dataset show that the proposed algorithm exceeds YOLO-X by 0.039 and YOLO-V5 by 0.015 in AP50. The proposed algorithm can be effectively applied to crowded pedestrian detection tasks. |
first_indexed | 2024-03-13T08:44:22Z |
format | Article |
id | doaj.art-afac1c8ccb03469fb5b55e340589736a |
institution | Directory of Open Access Journals |
issn | 2782-2826 2782-2818 |
language | English |
last_indexed | 2024-03-13T08:44:22Z |
publishDate | 2023-05-01 |
publisher | Siberian Scientific Centre DNIT |
record_format | Article |
series | Современные инновации, системы и технологии |
spelling | doaj.art-afac1c8ccb03469fb5b55e340589736a; 2023-05-30T07:35:55Z; eng; Siberian Scientific Centre DNIT; Современные инновации, системы и технологии; ISSN 2782-2826, 2782-2818; 2023-05-01; vol. 3, no. 2; DOI 10.47813/2782-2818-2023-3-2-0213-0224; DETR-crowd is all you need; Liu Weijia (0), Zishen Zheng (1), Ke Fan (2), Kun He (3), Taiqiu Huang (4), Weijia Liu (5), Xianlun Ke (6), Yuming Xu (7); affiliations in listed order: Trine University, Phoenix, USA; Taiyuan University of Technology, Taiyuan, China; Arizona State University, Phoenix, USA; Illinois Institute of Technology, Chicago, USA; Shenzhen University, Shenzhen, China; Trine University, Phoenix, United States; Yunnan University, Kunming, China; Shenzhen University, Shenzhen, China; https://oajmist.com/index.php/12/article/view/209 |
title | DETR-crowd is all you need |
title_sort | detr crowd is all you need |
url | https://oajmist.com/index.php/12/article/view/209 |
work_keys_str_mv | AT liuweijia detrcrowdisallyouneed AT zishenzheng detrcrowdisallyouneed AT kefan detrcrowdisallyouneed AT kunhe detrcrowdisallyouneed AT taiqiuhuang detrcrowdisallyouneed AT weijialiu detrcrowdisallyouneed AT xianlunke detrcrowdisallyouneed AT yumingxu detrcrowdisallyouneed |
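
The description states that Smooth-L1 and GIoU are combined as the training loss but gives no formula. The following is a minimal sketch of one way such a combination could look, not the authors' implementation. The function names, the corner box format (x1, y1, x2, y2), and the weights `w_l1=5.0` and `w_giou=2.0` are assumptions made for illustration; the record does not specify any of them.

```python
def smooth_l1(pred, target, beta=1.0):
    """Mean Smooth-L1 distance over the four box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for large errors.
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def giou(a, b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive width and height.
    """
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    # Intersection rectangle (zero if the boxes do not overlap).
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = area_a + area_b - inter
    # Smallest axis-aligned box enclosing both inputs.
    hull = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (hull - union) / hull


def box_loss(pred, target, w_l1=5.0, w_giou=2.0):
    """Weighted sum of Smooth-L1 and (1 - GIoU); the weights are assumed."""
    return w_l1 * smooth_l1(pred, target) + w_giou * (1.0 - giou(pred, target))
```

In this sketch the GIoU term still yields a useful signal when the predicted and ground-truth boxes do not overlap, because the enclosing-box penalty grows with their separation, while the Smooth-L1 term penalizes coordinate error directly.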