A Trainable System for Object Detection in Images and Video Sequences


Bibliographic Details
Main Author: Papageorgiou, Constantine P.
Language: en_US
Published: 2004
Subjects: AI; MIT; Artificial Intelligence; object detection; pattern recognition; people detection; face detection; car detection
Online Access:http://hdl.handle.net/1721.1/5566
collection MIT
description This thesis presents a general, trainable system for object detection in static images and video sequences. The core system finds a given class of objects in static images of completely unconstrained, cluttered scenes without using motion, tracking, or handcrafted models, and without making any assumptions about the scene structure or the number of objects in the scene. The system takes a set of positive and negative example images as training data, transforms the pixel images into a Haar wavelet representation, and uses a support vector machine classifier to learn the difference between in-class and out-of-class patterns. To detect objects in out-of-sample images, we perform a brute-force search over all subwindows in the image. The system is applied to face, people, and car detection with excellent results. For our extensions to video sequences, we augment the core static detection system in several ways: 1) extending the representation to five frames, 2) implementing an approximation to a Kalman filter, and 3) modeling detections in an image as a density and propagating this density through time according to measured features. In addition, we present a real-time version of the system that is currently running in a DaimlerChrysler experimental vehicle. As part of this thesis, we also present a system that, instead of detecting full patterns, uses a component-based approach; for people detection, we find it to be more robust to occlusions, rotations in depth, and severe lighting conditions than the full-body version. We also experiment with various other representations, including pixels and principal components, and present results that quantify how the number of features, color, and gray level affect performance.
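The detection pipeline the abstract describes (wavelet features computed over image subwindows, a learned linear decision function, and a brute-force scan over all subwindows) can be sketched as follows. This is an illustrative sketch, not the thesis code: it uses two simple Haar-like edge responses where the thesis uses a dense, overcomplete dictionary of Haar wavelets at multiple scales, a hand-set weight vector stands in for the trained support vector machine, and the scan runs at a single window size rather than over a scale pyramid.

```python
def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img[v][u] for v < y, u < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row = 0
        for x in range(w):
            row += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w x h rectangle at (x, y), in O(1)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def haar_features(ii, x, y, size):
    """Two Haar-like responses for a size x size subwindow: a
    vertical-edge and a horizontal-edge difference of rectangle sums."""
    half = size // 2
    left = rect_sum(ii, x, y, half, size)
    right = rect_sum(ii, x + half, y, half, size)
    top = rect_sum(ii, x, y, size, half)
    bottom = rect_sum(ii, x, y + half, size, half)
    return [left - right, top - bottom]

def detect(img, weights, bias, size, threshold=0.0):
    """Brute-force scan: score every size x size subwindow with a
    linear decision function w . f + b (a stand-in for the SVM's
    decision function) and keep windows above the threshold."""
    ii = integral_image(img)
    hits = []
    for y in range(len(img) - size + 1):
        for x in range(len(img[0]) - size + 1):
            f = haar_features(ii, x, y, size)
            score = sum(wi * fi for wi, fi in zip(weights, f)) + bias
            if score > threshold:
                hits.append((x, y, score))
    return hits
```

The integral image is what makes the exhaustive subwindow search affordable: each rectangle sum, and hence each wavelet coefficient, costs four array lookups regardless of window size.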
id mit-1721.1/5566
institution Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/5566 (last modified 2019-04-12T13:39:32Z)
A Trainable System for Object Detection in Images and Video Sequences
Papageorgiou, Constantine P.
Date issued: 2000-05-01; deposited: 2004-10-01T13:59:58Z
Report numbers: AITR-1685; CBCL-186
Online access: http://hdl.handle.net/1721.1/5566
Language: en_US
Extent: 128 p.; application/postscript (72537763 bytes); application/pdf (15910731 bytes)
title A Trainable System for Object Detection in Images and Video Sequences
topic AI
MIT
Artificial Intelligence
object detection
pattern recognition
people detection
face detection
car detection
url http://hdl.handle.net/1721.1/5566