Fast human detection with cascaded ensembles
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010.
Main Author: | Bilgic̦, Berkin |
---|---|
Other Authors: | Ichiro Masaki and Berthold K.P. Horn |
Format: | Thesis |
Language: | eng |
Published: | Massachusetts Institute of Technology, 2010 |
Subjects: | Electrical Engineering and Computer Science. |
Online Access: | http://hdl.handle.net/1721.1/57684 |
_version_ | 1811083883475632128 |
---|---|
author | Bilgic̦, Berkin |
author2 | Ichiro Masaki and Berthold K.P. Horn. |
author_facet | Ichiro Masaki and Berthold K.P. Horn. Bilgic̦, Berkin |
author_sort | Bilgic̦, Berkin |
collection | MIT |
description | Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. |
first_indexed | 2024-09-23T12:41:08Z |
format | Thesis |
id | mit-1721.1/57684 |
institution | Massachusetts Institute of Technology |
language | eng |
last_indexed | 2024-09-23T12:41:08Z |
publishDate | 2010 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/57684 2019-04-13T00:05:52Z Fast human detection with cascaded ensembles Bilgic̦, Berkin Ichiro Masaki and Berthold K.P. Horn. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 75-78). Detecting people in images is a challenging task because of the variability in clothing and illumination conditions, and the wide range of poses that people can adopt. To discriminate the human shape clearly, Dalal and Triggs [1] proposed a gradient based, robust feature set that yielded excellent detection results. This method computes locally normalized gradient orientation histograms over blocks of size 16x16 pixels representing a detection window. The block histograms within the window are then concatenated. The resulting feature vector is powerful enough to detect people with 88% detection rate at 10⁻⁴ false positives per window (FPPW) using a linear SVM. The detection window slides over the image in all possible image scales; hence this is computationally expensive, being able to run at 1 FPS for a 320x240 image on a typical CPU with a sparse scanning methodology. Due to its simplicity and high descriptive power, several authors worked on the Dalal-Triggs algorithm to make it feasible for real time detection. One such approach is to implement this method on a Graphics Processing Unit (GPU), exploiting the parallelisms in the algorithm. Another way is to formulate the detector as an attentional cascade, so as to allow early rejections to decrease the detection time. Zhu et al. [2] demonstrated that it is possible to obtain a 30x speed up over the original algorithm with this methodology. 
In this thesis, we combine the two proposed methods and investigate the feasibility of a fast person localization framework that integrates the cascade-of-rejectors approach with the Histograms of Oriented Gradients (HoG) features on a data parallel architecture. The salient features of people are captured by HoG blocks of variable sizes and locations which are chosen by the AdaBoost algorithm from a large set of possible blocks. We use the integral image representation for histogram computation and a rejection cascade in a sliding-windows manner, both of which can be implemented in a data parallel fashion. Utilizing the NVIDIA CUDA framework to realize this method on a Graphics Processing Unit (GPU), we report a speed up by a factor of 13 over our CPU implementation. For a 1280x960 image our parallel technique attains a processing speed of 2.5 to 8 frames per second depending on the image scanning density, with a detection quality comparable to the original HoG algorithm. by Berkin Bilgic̦. S.M. 2010-08-30T14:34:37Z 2010-08-30T14:34:37Z 2010 2010 Thesis http://hdl.handle.net/1721.1/57684 635559907 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 78 p. application/pdf Massachusetts Institute of Technology |
spellingShingle | Electrical Engineering and Computer Science. Bilgic̦, Berkin Fast human detection with cascaded ensembles |
title | Fast human detection with cascaded ensembles |
title_full | Fast human detection with cascaded ensembles |
title_fullStr | Fast human detection with cascaded ensembles |
title_full_unstemmed | Fast human detection with cascaded ensembles |
title_short | Fast human detection with cascaded ensembles |
title_sort | fast human detection with cascaded ensembles |
topic | Electrical Engineering and Computer Science. |
url | http://hdl.handle.net/1721.1/57684 |
work_keys_str_mv | AT bilgicberkin fasthumandetectionwithcascadedensembles |
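
Illustrative note (not part of the catalog record): the abstract names two building blocks, integral images for histogram computation and a rejection cascade over sliding windows. The sketch below shows one way these could fit together on a CPU in plain Python. It is not the thesis code. The function names, the single-cell 9-bin histogram, the L2 normalization, and the weak-classifier layout `(block, weights, bias, alpha)` are all assumptions made for illustration; the thesis itself targets CUDA and selects blocks with AdaBoost.

```python
import math

def integral_histograms(mag, ori, n_bins=9):
    """Build one integral image per orientation bin (assumed layout).

    mag, ori: 2-D lists of gradient magnitude and unsigned orientation
    in [0, pi). ii[b][y][x] holds the summed magnitude of bin b over
    rows [0, y) and columns [0, x).
    """
    h, w = len(mag), len(mag[0])
    ii = [[[0.0] * (w + 1) for _ in range(h + 1)] for _ in range(n_bins)]
    for y in range(h):
        for x in range(w):
            b_hit = min(int(ori[y][x] / math.pi * n_bins), n_bins - 1)
            for b in range(n_bins):
                v = mag[y][x] if b == b_hit else 0.0
                ii[b][y + 1][x + 1] = (v + ii[b][y][x + 1]
                                       + ii[b][y + 1][x] - ii[b][y][x])
    return ii

def block_histogram(ii, x, y, w, h):
    """Histogram of the w-by-h block at (x, y): four lookups per bin."""
    return [p[y + h][x + w] - p[y][x + w] - p[y + h][x] + p[y][x] for p in ii]

def cascade_accepts(ii, x0, y0, stages):
    """Run a rejection cascade on the window whose origin is (x0, y0).

    stages: list of (weak_classifiers, stage_threshold). Each weak
    classifier is (block, weights, bias, alpha), block = (dx, dy, w, h).
    This structure is hypothetical, chosen only to show early rejection.
    """
    for weak_classifiers, stage_threshold in stages:
        score = 0.0
        for (dx, dy, w, h), weights, bias, alpha in weak_classifiers:
            hist = block_histogram(ii, x0 + dx, y0 + dy, w, h)
            norm = math.sqrt(sum(v * v for v in hist)) + 1e-6
            response = sum(wt * v / norm for wt, v in zip(weights, hist)) + bias
            if response > 0:
                score += alpha  # weak classifier votes "person"
        if score < stage_threshold:
            return False  # early rejection: later stages never run
    return True
```

Because every block histogram costs a fixed number of lookups regardless of block size, and each window is evaluated independently of the others, both steps map naturally onto one-thread-per-window parallel execution, which is the property the abstract relies on.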