Robust Head Detection in Complex Videos Using Two-Stage Deep Convolution Framework

Pedestrian head detection plays an important role in identifying and localizing individuals in real world visual data. Head detection is a nontrivial problem due to considerable variance in camera view-points, scales, human poses, and appearances in the scene. Thanks to the translation invariance pr...

Bibliographic Details
Main Authors: Sultan Daud Khan, Yasir Ali, Basim Zafar, Abdulfattah Noorwali
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
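Subjects: Convolutional neural networks; non-maximal suppression; head detection; crowd counting; motion analysis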
Online Access: https://ieeexplore.ieee.org/document/9096268/
author Sultan Daud Khan
Yasir Ali
Basim Zafar
Abdulfattah Noorwali
collection DOAJ
description Pedestrian head detection plays an important role in identifying and localizing individuals in real-world visual data. Head detection is a nontrivial problem due to considerable variance in camera viewpoints, scales, human poses, and appearances in the scene. The translation invariance property of convolutional neural networks (CNNs) enables large-capacity CNNs to handle appearance and pose variations in the scene. However, scale invariance is still an open issue. To address this problem, this paper presents a two-stage head detection framework that uses a fully convolutional network (FCN) to generate scale-aware proposals, followed by a CNN that classifies each proposal into two classes, i.e. head and background. Experimental results show that using scale-aware proposals obtained by the FCN improves the object recall rate and mean average precision (mAP). Additionally, we demonstrate that our framework achieves state-of-the-art results on four challenging benchmark datasets, i.e. HollywoodHeads, Casablanca, SHOCK, and WIDERFACE.
first_indexed 2024-12-19T08:34:20Z
format Article
id doaj.art-f5cf6823428f4ab8b0b05f276eafedca
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-19T08:34:20Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-f5cf6823428f4ab8b0b05f276eafedca
2022-12-21T20:29:06Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
Volume 8, pages 98679-98692
DOI 10.1109/ACCESS.2020.2995764
9096268
Robust Head Detection in Complex Videos Using Two-Stage Deep Convolution Framework
Sultan Daud Khan [0] https://orcid.org/0000-0002-7406-8441
Yasir Ali [1] https://orcid.org/0000-0001-8163-8943
Basim Zafar [2] https://orcid.org/0000-0001-5407-7941
Abdulfattah Noorwali [3] https://orcid.org/0000-0001-9942-2526
[0] Department of Computer Science, National University of Technology, Islamabad, Pakistan
[1] Expert Vision Consulting, Makkah, Saudi Arabia
[2] Expert Vision Consulting, Makkah, Saudi Arabia
[3] Department of Electrical Engineering, Umm Al-Qura University, Makkah, Saudi Arabia
https://ieeexplore.ieee.org/document/9096268/
Convolutional neural networks; non-maximal suppression; head detection; crowd counting; motion analysis
title Robust Head Detection in Complex Videos Using Two-Stage Deep Convolution Framework
topic Convolutional neural networks
non-maximal suppression
head detection
crowd counting
motion analysis
url https://ieeexplore.ieee.org/document/9096268/
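
The abstract in this record describes a two-stage pipeline: candidate regions are proposed first, then each proposal is classified as head or background. The record also lists non-maximal suppression as a subject. The following is only a minimal sketch of that general structure, not the authors' implementation. The function names, the box format (x1, y1, x2, y2), both thresholds, and the use of a plain callable in place of the CNN classifier are all assumptions made for illustration; the FCN proposal stage is not modelled and proposals are simply passed in.

```python
from typing import Callable, List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_thresh: float = 0.5) -> List[int]:
    """Greedy non-maximum suppression; returns kept indices, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


def detect_heads(
    proposals: List[Box],
    classify: Callable[[Box], float],
    score_thresh: float = 0.5,  # assumed value, not from the paper
    iou_thresh: float = 0.5,    # assumed value, not from the paper
) -> List[Tuple[Box, float]]:
    """Stage 2 of the sketch: score each proposal, drop background, apply NMS.

    `proposals` stands in for the output of the proposal stage, and
    `classify` stands in for the head/background classifier; it is assumed
    to return a head probability in [0, 1].
    """
    scored = [(box, classify(box)) for box in proposals]
    heads = [(box, s) for box, s in scored if s >= score_thresh]
    keep = nms([b for b, _ in heads], [s for _, s in heads], iou_thresh)
    return [heads[i] for i in keep]
```

For example, with four proposals, two of which overlap heavily and one of which scores below the threshold, `detect_heads` would return the higher-scoring box of the overlapping pair plus the remaining confident box.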