Multi-scale cross-layer fusion and center position network for pedestrian detection
Pedestrian detection has made breakthroughs after the rise of convolutional neural networks. However, it faces some challenging problems, including dataset difference, small pedestrian targets and occlusions between pedestrians. To deal with these problems, we propose a novel convolutional network a...
Main Authors: | Qian Liu, Youwei Qi, Cunbao Wang |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2024-01-01 |
Series: | Journal of King Saud University: Computer and Information Sciences |
Subjects: | Object detection, Pedestrian detection, Feature fusion, Attention |
Online Access: | http://www.sciencedirect.com/science/article/pii/S1319157823004408 |
_version_ | 1827356713188065280 |
---|---|
author | Qian Liu Youwei Qi Cunbao Wang |
collection | DOAJ |
description | Pedestrian detection has advanced substantially since the rise of convolutional neural networks. However, it still faces challenging problems, including differences between datasets, small pedestrian targets, and occlusion between pedestrians. To address these problems, we propose a novel convolutional network architecture named the multi-scale cross-layer fusion and center position network (MCF-CP-NET). A new backbone unit introduces channel-wise attention into improved aggregated residual transformations for effective feature extraction. We select suitable anchor settings for pedestrian detection datasets to tackle the problem of dataset differences. A feature pyramid sub-network with cross-layer fusion is developed to better detect small pedestrian targets; its cross-layer connections reduce information loss and the dissipation of low-level marginal features, and better fuse low- and high-level features. To better detect occluded pedestrians, we add a center position branch to the localization regression sub-network of MCF-CP-NET; it predicts the centrality index of the localization box to obtain a center score, which is then used to further optimize the score used in non-maximum suppression. Experiments show that, compared with the state of the art, MCF-CP-NET improves average precision and recall by 1.2% and 0.7%, respectively, on the person class of the Pascal VOC2007 dataset and by 1.6% and 0.1%, respectively, on the WiderPerson dataset. |
first_indexed | 2024-03-08T05:13:55Z |
format | Article |
id | doaj.art-369aa44cc1414a229e923c080b96aa63 |
institution | Directory of Open Access Journals |
issn | 1319-1578 |
language | English |
last_indexed | 2024-03-08T05:13:55Z |
publishDate | 2024-01-01 |
publisher | Elsevier |
record_format | Article |
series | Journal of King Saud University: Computer and Information Sciences |
spelling | doaj.art-369aa44cc1414a229e923c080b96aa63; 2024-02-07T04:42:55Z; eng; Elsevier; Journal of King Saud University: Computer and Information Sciences; 1319-1578; 2024-01-01; 36; 1; 101886; Multi-scale cross-layer fusion and center position network for pedestrian detection; Qian Liu (corresponding author), Youwei Qi, Cunbao Wang; School of Artificial Intelligence (School of Future Technology), Nanjing University of Information Science & Technology, Nanjing 210044, China (all three authors); http://www.sciencedirect.com/science/article/pii/S1319157823004408; Object detection; Pedestrian detection; Feature fusion; Attention |
title | Multi-scale cross-layer fusion and center position network for pedestrian detection |
topic | Object detection Pedestrian detection Feature fusion Attention |
url | http://www.sciencedirect.com/science/article/pii/S1319157823004408 |
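
The description states that a center position branch predicts a center score that is used to optimize the score in non-maximum suppression, but the record does not give the formula. The following is a minimal sketch of one common way such a score can be used, assuming the center score is simply multiplied into the classification score before a standard greedy NMS pass. The function names, the multiplicative fusion, and the IoU threshold of 0.5 are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def center_weighted_nms(
    boxes: Sequence[Box],
    cls_scores: Sequence[float],
    center_scores: Sequence[float],
    iou_threshold: float = 0.5,  # assumed value, not from the record
) -> List[int]:
    """Greedy NMS ranked by cls_score * center_score (assumed fusion rule).

    Returns the indices of the kept boxes, best fused score first.
    """
    fused = [c * s for c, s in zip(cls_scores, center_scores)]
    order = sorted(range(len(boxes)), key=lambda i: fused[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    cls_scores = [0.9, 0.8, 0.7]
    center_scores = [0.5, 0.9, 0.8]
    print(center_weighted_nms(boxes, cls_scores, center_scores))  # [1, 2]
```

In this toy input, box 0 has the highest classification score but a low center score, so the better-centered box 1 outranks it and suppresses it; ranking by classification score alone would have kept box 0 instead.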