Full interpretation of minimal images

The goal of this work is to model the process of ‘full interpretation’ of object images, which is the ability to identify and localize all semantic features and parts that are recognized by human observers. The task is approached by dividing the interpretation of the complete object into the interpret...

Bibliographic Details
Main Authors: Ben-Yosef, Guy, Assif, Liav, Ullman, Shimon
Format: Technical Report
Language: en_US
Published: Center for Brains, Minds and Machines (CBMM) 2017
Subjects: Image interpretation; Visual object recognition; Parts and relations
Online Access: http://hdl.handle.net/1721.1/106887
_version_ 1811074289956290560
author Ben-Yosef, Guy
Assif, Liav
Ullman, Shimon
author_facet Ben-Yosef, Guy
Assif, Liav
Ullman, Shimon
author_sort Ben-Yosef, Guy
collection MIT
description The goal of this work is to model the process of ‘full interpretation’ of object images, which is the ability to identify and localize all semantic features and parts that are recognized by human observers. The task is approached by dividing the interpretation of the complete object into the interpretation of multiple reduced but interpretable local regions. In such reduced regions, interpretation is simpler, since the number of semantic components is small and the variability of possible configurations is low. We model the interpretation process by identifying primitive components and relations that play a useful role in local interpretation by humans. To identify these components and relations, we consider the interpretation of ‘minimal configurations’: reduced local regions that are minimal in the sense that further reduction renders them unrecognizable and uninterpretable. We show that such minimal interpretable images have useful properties, which we use to identify informative features and relations used for full interpretation. We describe our interpretation model and show results of detailed interpretations of minimal configurations, produced automatically by the model. Finally, we discuss implications of full interpretation for difficult visual tasks, such as recognizing human activities or interactions, which are beyond the scope of current models of visual recognition.
first_indexed 2024-09-23T09:46:43Z
format Technical Report
id mit-1721.1/106887
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T09:46:43Z
publishDate 2017
publisher Center for Brains, Minds and Machines (CBMM)
record_format dspace
spelling mit-1721.1/106887
2019-09-12T22:58:43Z
Full interpretation of minimal images
Ben-Yosef, Guy
Assif, Liav
Ullman, Shimon
Image interpretation
Visual object recognition
Parts and relations
This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.
2017-02-08T19:25:35Z
2017-02-08T19:25:35Z
2017-02-08
Technical Report
Working Paper
Other
http://hdl.handle.net/1721.1/106887
en_US
CBMM Memo Series;061
Attribution-NonCommercial-ShareAlike 3.0 United States
http://creativecommons.org/licenses/by-nc-sa/3.0/us/
application/pdf
Center for Brains, Minds and Machines (CBMM)
title Full interpretation of minimal images
title_full Full interpretation of minimal images
title_fullStr Full interpretation of minimal images
title_full_unstemmed Full interpretation of minimal images
title_short Full interpretation of minimal images
title_sort full interpretation of minimal images
topic Image interpretation
Visual object recognition
Parts and relations
url http://hdl.handle.net/1721.1/106887