Basic level scene understanding: categories, attributes and structures

A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images which can depict an enormous variety of environments that comprise our visual world. This paper s...

Bibliographic Details
Main Authors: Patterson, Genevieve, Xiao, Jianxiong, Hays, James, Russell, Bryan Christopher, Ehinger, Krista A, Torralba, Antonio, Oliva, Aude
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Published: Frontiers Media SA 2018
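Subjects: SUN database, basic level scene understanding, scene recognition, scene attributes, geometry recognition, 3D context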
Online Access: http://hdl.handle.net/1721.1/116359
https://orcid.org/0000-0003-4915-0256
author Patterson, Genevieve
Xiao, Jianxiong
Hays, James
Russell, Bryan Christopher
Ehinger, Krista A
Torralba, Antonio
Oliva, Aude
author2 Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Patterson, Genevieve
Xiao, Jianxiong
Hays, James
Russell, Bryan Christopher
Ehinger, Krista A
Torralba, Antonio
Oliva, Aude
author_sort Patterson, Genevieve
collection MIT
description A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images which can depict an enormous variety of environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the richly annotated SUN database which is a collection of annotated images spanning 908 different scene categories with object, attribute, and geometric labels for many scenes. This database allows us to systematically study the space of scenes and to establish a benchmark for scene and object recognition. We augment the categorical SUN database with 102 scene attributes for every image and explore attribute recognition. Finally, we present an integrated system to extract the 3D structure of the scene and objects depicted in an image.
first_indexed 2024-09-23T15:50:56Z
format Article
id mit-1721.1/116359
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T15:50:56Z
publishDate 2018
publisher Frontiers Media SA
record_format dspace
spelling mit-1721.1/116359
2022-10-02T04:35:35Z
Basic level scene understanding: categories, attributes and structures
Patterson, Genevieve
Xiao, Jianxiong
Hays, James
Russell, Bryan Christopher
Ehinger, Krista A
Torralba, Antonio
Oliva, Aude
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Xiao, Jianxiong
Hays, James
Russell, Bryan Christopher
Ehinger, Krista A
Torralba, Antonio
Oliva, Aude
SUN database, basic level scene understanding, scene recognition, scene attributes, geometry recognition, 3D context
A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images which can depict an enormous variety of environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the richly annotated SUN database which is a collection of annotated images spanning 908 different scene categories with object, attribute, and geometric labels for many scenes. This database allows us to systematically study the space of scenes and to establish a benchmark for scene and object recognition. We augment the categorical SUN database with 102 scene attributes for every image and explore attribute recognition. Finally, we present an integrated system to extract the 3D structure of the scene and objects depicted in an image.
Google U.S./Canada Ph.D. Fellowship in Computer Vision
National Science Foundation (U.S.) (grant 1016862)
Google Faculty Research Award
National Science Foundation (U.S.) (Career Award 1149853)
National Science Foundation (U.S.) (Career Award 0747120)
United States. Office of Naval Research. Multidisciplinary University Research Initiative (N000141010933)
2018-06-18T15:51:59Z
2018-06-18T15:51:59Z
2013-08
2018-05-11T13:21:26Z
Article
http://purl.org/eprint/type/JournalArticle
1664-1078
http://hdl.handle.net/1721.1/116359
Xiao, Jianxiong, James Hays, Bryan C. Russell, Genevieve Patterson, Krista A. Ehinger, Antonio Torralba, and Aude Oliva. “Basic Level Scene Understanding: Categories, Attributes and Structures.” Frontiers in Psychology 4 (2013).
https://orcid.org/0000-0003-4915-0256
http://dx.doi.org/10.3389/FPSYG.2013.00506
Frontiers in Psychology
Creative Commons Attribution 4.0 International License
http://creativecommons.org/licenses/by/4.0/
application/pdf
Frontiers Media SA
Frontiers
spellingShingle SUN database, basic level scene understanding, scene recognition, scene attributes, geometry recognition, 3D context
Patterson, Genevieve
Xiao, Jianxiong
Hays, James
Russell, Bryan Christopher
Ehinger, Krista A
Torralba, Antonio
Oliva, Aude
Basic level scene understanding: categories, attributes and structures
title Basic level scene understanding: categories, attributes and structures
topic SUN database, basic level scene understanding, scene recognition, scene attributes, geometry recognition, 3D context
url http://hdl.handle.net/1721.1/116359
https://orcid.org/0000-0003-4915-0256