VICTORIOUS : video indexing with combined tracking and object recognition for improved object understanding in scenes

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.

Bibliographic Details
Main Author:	Xu, Yuetian
Other Authors:	Madison, Richard W.; Poggio, Tomaso A.
Format:	Thesis
Language:	eng
Published:	Massachusetts Institute of Technology 2011
Subjects:	Electrical Engineering and Computer Science.
Online Access:	http://hdl.handle.net/1721.1/61290

Summary:	Automatic understanding of video content is a problem which grows in importance every day. Video understanding algorithms require accuracy, robustness, speed, and scalability. Accuracy generates user confidence in usage. Robustness enables greater autonomy and reduced human intervention. Applications such as navigation and mapping demand real-time performance. Scalability is also important for maintaining high speed while expanding capacity to multiple users and sensors. In this thesis, I propose a "bag-of-phrases" model to improve the accuracy and robustness of the popular "bag-of-words" models. This model applies a "geometric grammar" to add structural constraints to the unordered "bag-of-words." I incorporate this model into an architecture which combines an object recognizer, a tracker, and a geolocation module. This architecture has the ability to use the complementarity of its components to compensate for its weaknesses. This allows for improvements in accuracy, robustness, and speed. Subsequently, I introduce VICTORIOUS, a fast implementation of the proposed architecture. Evaluation on computer-generated data as well as Caltech-101 indicates that this implementation is accurate, robust, and capable of performing in real time on current generation hardware. This implementation, together with the "bag-of-phrases" model and integrated architecture, forms a step towards meeting the requirements for an accurate, robust, real-time vision system.
Alternate Title:	Video indexing with combined tracking and object recognition for improved object understanding in scenes
Statement of Responsibility:	by Yuetian Xu.
Degree:	M.Eng.
Department:	Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Notes:	Cataloged from PDF version of thesis. Includes bibliographical references (p. ).
Dates:	2011-02-23T14:42:33Z; 2009
Identifiers:	http://hdl.handle.net/1721.1/61290; 702644367
File Format:	application/pdf
Rights:	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582


Similar Items