VICTORIOUS : video indexing with combined tracking and object recognition for improved object understanding in scenes

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.

Bibliographic Details
Main Author: Xu, Yuetian
Other Authors: Richard W. Madison and Tomaso A. Poggio.
Format: Thesis
Language: English
Published: Massachusetts Institute of Technology 2011
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/61290
Abstract: Automatic understanding of video content is a problem that grows in importance every day. Video understanding algorithms require accuracy, robustness, speed, and scalability. Accuracy builds user confidence. Robustness enables greater autonomy and reduces human intervention. Applications such as navigation and mapping demand real-time performance. Scalability is also important for maintaining high speed while expanding capacity to multiple users and sensors. In this thesis, I propose a "bag-of-phrases" model to improve the accuracy and robustness of the popular "bag-of-words" models. This model applies a "geometric grammar" to add structural constraints to the unordered "bag-of-words." I incorporate this model into an architecture that combines an object recognizer, a tracker, and a geolocation module. This architecture uses the complementarity of its components to compensate for their individual weaknesses, which allows for improvements in accuracy, robustness, and speed. Subsequently, I introduce VICTORIOUS, a fast implementation of the proposed architecture. Evaluation on computer-generated data as well as Caltech-101 indicates that this implementation is accurate, robust, and capable of performing in real time on current-generation hardware. This implementation, together with the "bag-of-phrases" model and integrated architecture, forms a step towards meeting the requirements for an accurate, robust, real-time vision system.
Statement of Responsibility: by Yuetian Xu.
Degree: M.Eng.
Department: Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Notes: Cataloged from PDF version of thesis. Includes bibliographical references (p. ).
File Format: application/pdf
Other Identifier: 702644367
Rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582