Multi-Scale Pooling In Deep Neural Networks For Dense Crowd Estimation
State-of-the-art methods for counting people in dense crowds struggle to estimate crowd density accurately for the following reasons. They typically apply the same filters over a complete image or over large image patches. Only then can perspective distortion be compensated for, by estimating loca...
Main Authors: | Ali Raza Radhan, Fareed Ahmed Jokhio, Ghulam Hussain, Kamran Javed, Arsalan Ahmed |
---|---|
Format: | Article |
Language: | English |
Published: | Sukkur IBA University, 2022-06-01 |
Series: | Sukkur IBA Journal of Emerging Technologies |
Subjects: | Perspective Distortion, local scale, image patches, crowd counting, deep learning |
Online Access: | http://journal.iba-suk.edu.pk:8089/SIBAJournals/index.php/sjet/article/view/1023 |
_version_ | 1811341382816628736 |
---|---|
author | Ali Raza Radhan Fareed Ahmed Jokhio Ghulam Hussain Kamran Javed Arsalan Ahmed |
author_facet | Ali Raza Radhan Fareed Ahmed Jokhio Ghulam Hussain Kamran Javed Arsalan Ahmed |
author_sort | Ali Raza Radhan |
collection | DOAJ |
description |
State-of-the-art methods for counting people in dense crowds struggle to estimate crowd density accurately for the following reasons. They typically apply the same filters over a complete image or over large image patches. Only then can perspective distortion be compensated for, by estimating the local scale, which is achieved by training an additional classifier that chooses the optimal kernel size from a limited set of choices. These methods are restricted to the context in which they are applied because they are not end-to-end trainable; they cannot account for rapid scale changes because they assign a single scale to large image patches; and they can exploit only a narrow range of receptive fields if the networks are to remain of a feasible size. In this study, we introduce an end-to-end trainable deep architecture that merges features obtained from multiple kernels of different sizes and learns to handle rapid scale changes and to use the right context at each image location. This technique flexibly encodes the scale of the relevant information to predict crowd density precisely. The training and validation losses of the proposed approach are 5% and 4% lower, respectively, than those of the state-of-the-art context-aware method.
|
first_indexed | 2024-04-13T18:54:41Z |
format | Article |
id | doaj.art-902a4c5b199d4cdba5549d7cec7a0ba7 |
institution | Directory Open Access Journal |
issn | 2616-7069 2617-3115 |
language | English |
last_indexed | 2024-04-13T18:54:41Z |
publishDate | 2022-06-01 |
publisher | Sukkur IBA University |
record_format | Article |
series | Sukkur IBA Journal of Emerging Technologies |
spelling | doaj.art-902a4c5b199d4cdba5549d7cec7a0ba7 2022-12-22T02:34:18Z eng Sukkur IBA University Sukkur IBA Journal of Emerging Technologies 2616-7069 2617-3115 2022-06-01 5 1 10.30537/sjet.v5i1.1023 Multi-Scale Pooling In Deep Neural Networks For Dense Crowd Estimation Ali Raza Radhan 0 Fareed Ahmed Jokhio 1 Ghulam Hussain 2 Kamran Javed 3 Arsalan Ahmed 4 Dept. Electronic Engineering, Quaid-e-Awam University, Larkana, Pakistan; Computer System Engineering, Quaid-e-Awam University, Nawabshah, Pakistan; Electronic Engineering, Quaid-e-Awam University, Larkana; National Centre of Artificial Intelligence (NCAI), Saudi Data and Artificial Intelligence Authority (SDAIA), Riyadh, Saudi Arabia; Dept. Electronic Engineering, Quaid-e-Awam University, Larkana, Pakistan. State-of-the-art methods for counting people in dense crowds struggle to estimate crowd density accurately for the following reasons. They typically apply the same filters over a complete image or over large image patches. Only then can perspective distortion be compensated for, by estimating the local scale, which is achieved by training an additional classifier that chooses the optimal kernel size from a limited set of choices. These methods are restricted to the context in which they are applied because they are not end-to-end trainable; they cannot account for rapid scale changes because they assign a single scale to large image patches; and they can exploit only a narrow range of receptive fields if the networks are to remain of a feasible size. In this study, we introduce an end-to-end trainable deep architecture that merges features obtained from multiple kernels of different sizes and learns to handle rapid scale changes and to use the right context at each image location. This technique flexibly encodes the scale of the relevant information to predict crowd density precisely. The training and validation losses of the proposed approach are 5% and 4% lower, respectively, than those of the state-of-the-art context-aware method.
http://journal.iba-suk.edu.pk:8089/SIBAJournals/index.php/sjet/article/view/1023 Perspective Distortion, local scale, image patches, crowd counting, deep learning |
spellingShingle | Ali Raza Radhan Fareed Ahmed Jokhio Ghulam Hussain Kamran Javed Arsalan Ahmed Multi-Scale Pooling In Deep Neural Networks For Dense Crowd Estimation Sukkur IBA Journal of Emerging Technologies Perspective Distortion, local scale, image patches, crowd counting, deep learning |
title | Multi-Scale Pooling In Deep Neural Networks For Dense Crowd Estimation |
title_full | Multi-Scale Pooling In Deep Neural Networks For Dense Crowd Estimation |
title_fullStr | Multi-Scale Pooling In Deep Neural Networks For Dense Crowd Estimation |
title_full_unstemmed | Multi-Scale Pooling In Deep Neural Networks For Dense Crowd Estimation |
title_short | Multi-Scale Pooling In Deep Neural Networks For Dense Crowd Estimation |
title_sort | multi scale pooling in deep neural networks for dense crowd estimation |
topic | Perspective Distortion, local scale, image patches, crowd counting, deep learning |
url | http://journal.iba-suk.edu.pk:8089/SIBAJournals/index.php/sjet/article/view/1023 |
work_keys_str_mv | AT alirazaradhan multiscalepoolingindeepneuralnetworksfordensecrowdestimation AT fareedahmedjokhio multiscalepoolingindeepneuralnetworksfordensecrowdestimation AT ghulamhussain multiscalepoolingindeepneuralnetworksfordensecrowdestimation AT kamranjaved multiscalepoolingindeepneuralnetworksfordensecrowdestimation AT arsalanahmed multiscalepoolingindeepneuralnetworksfordensecrowdestimation |