Toward visual understanding of everyday object
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.
Main Author: | Lim, Joseph J. (Joseph Jaewhan) |
---|---|
Other Authors: | Antonio Torralba. |
Format: | Thesis |
Language: | eng |
Published: | Massachusetts Institute of Technology 2016 |
Subjects: | Electrical Engineering and Computer Science. |
Online Access: | http://hdl.handle.net/1721.1/101574 |
_version_ | 1811094927566700544 |
---|---|
author | Lim, Joseph J. (Joseph Jaewhan) |
author2 | Antonio Torralba. |
author_facet | Antonio Torralba. Lim, Joseph J. (Joseph Jaewhan) |
author_sort | Lim, Joseph J. (Joseph Jaewhan) |
collection | MIT |
description | Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015. |
first_indexed | 2024-09-23T16:07:44Z |
format | Thesis |
id | mit-1721.1/101574 |
institution | Massachusetts Institute of Technology |
language | eng |
last_indexed | 2024-09-23T16:07:44Z |
publishDate | 2016 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/101574 2019-04-11T03:23:06Z Toward visual understanding of everyday object Lim, Joseph J. (Joseph Jaewhan) Antonio Torralba. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015. Cataloged from PDF version of thesis. Includes bibliographical references (pages 83-92). The computer vision community has made impressive progress on object recognition using large-scale data. However, for any visual system to interact with objects, it needs to do much more than simply recognize where the objects are. The goal of my research is to explore and solve object understanding tasks for interaction - finding an object's pose in 3D, understanding its various states and transformations, and interpreting its physical interactions. In this thesis, I will focus on two specific aspects of this agenda: 3D object pose estimation and object state understanding. Precise pose estimation is a challenging problem. One reason is that an object's appearance inside an image can vary widely with conditions (e.g., location, occlusions, and lighting). I address these issues by utilizing 3D models directly. The goal is to develop a method that can exploit all possible views provided by a 3D model - a single 3D model represents infinitely many 2D views of the same object. I have developed a method that uses the 3D geometry of an object for pose estimation. The method can then also learn additional real-world statistics, such as which poses appear more frequently, which area is more likely to contain an object, and which parts are commonly occluded and discriminative. These methods allow us to localize and estimate the exact pose of objects in natural images. Finally, I will also describe work on learning and inferring the different states and transformations an object class can undergo. Objects in visual scenes come in a rich variety of transformed states. A few classes of transformation have been heavily studied in computer vision: mostly simple, parametric changes in color and geometry. However, transformations in the physical world occur in many more flavors, and they come with semantic meaning: e.g., bending, folding, aging, etc. Hence, the goal is to learn about an object class, in terms of its states and transformations, using a collection of images from an image search engine. by Joseph J. Lim. Ph. D. 2016-03-03T21:09:54Z 2016-03-03T21:09:54Z 2015 2015 Thesis http://hdl.handle.net/1721.1/101574 940766295 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 xviii, 92 pages application/pdf Massachusetts Institute of Technology |
spellingShingle | Electrical Engineering and Computer Science. Lim, Joseph J. (Joseph Jaewhan) Toward visual understanding of everyday object |
title | Toward visual understanding of everyday object |
title_full | Toward visual understanding of everyday object |
title_fullStr | Toward visual understanding of everyday object |
title_full_unstemmed | Toward visual understanding of everyday object |
title_short | Toward visual understanding of everyday object |
title_sort | toward visual understanding of everyday object |
topic | Electrical Engineering and Computer Science. |
url | http://hdl.handle.net/1721.1/101574 |
work_keys_str_mv | AT limjosephjjosephjaewhan towardvisualunderstandingofeverydayobject |