Toward visual understanding of everyday object

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.

Bibliographic Details
Main Author: Lim, Joseph J. (Joseph Jaewhan)
Other Authors: Antonio Torralba.
Format: Thesis
Language: English
Published: Massachusetts Institute of Technology 2016
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/101574
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Degree: Ph. D.
Statement of Responsibility: by Joseph J. Lim.
Notes: Cataloged from PDF version of thesis. Includes bibliographical references (pages 83-92).
Physical Description: xviii, 92 pages (application/pdf)
Other Identifier: 940766295
Repository Timestamp: 2016-03-03T21:09:54Z
Rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Summary:

The computer vision community has made impressive progress on object recognition using large-scale data. However, for any visual system to interact with objects, it needs to understand much more than where the objects are. The goal of my research is to explore and solve object understanding tasks for interaction - finding an object's pose in 3D, understanding its various states and transformations, and interpreting its physical interactions. In this thesis, I will focus on two specific aspects of this agenda: 3D object pose estimation and object state understanding.

Precise pose estimation is a challenging problem. One reason is that an object's appearance in an image can vary widely with conditions such as location, occlusion, and lighting. I address these issues by utilizing 3D models directly. The goal is to develop a method that can exploit all possible views provided by a 3D model - a single 3D model represents infinitely many 2D views of the same object. I have developed a method that uses the 3D geometry of an object for pose estimation. The method can then also learn additional real-world statistics, such as which poses appear more frequently, which area is more likely to contain an object, and which parts are commonly occluded and discriminative. These methods allow us to localize and estimate the exact pose of objects in natural images.

Finally, I will also describe the work on learning and inferring different states and transformations an object class can undergo. Objects in visual scenes come in a rich variety of transformed states. A few classes of transformation have been heavily studied in computer vision: mostly simple, parametric changes in color and geometry. However, transformations in the physical world occur in many more flavors, and they come with semantic meaning: e.g., bending, folding, aging, etc. Hence, the goal is to learn about an object class, in terms of its states and transformations, using a collection of images from an image search engine.