Full interpretation of minimal images
The goal of this work is to model the process of ‘full interpretation’ of object images, which is the ability to identify and localize all semantic features and parts that are recognized by human observers. The task is approached by dividing the interpretation of the complete object into the interpretation of multiple reduced but interpretable local regions. In such reduced regions, interpretation is simpler, since the number of semantic components is small and the variability of possible configurations is low.
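To make the idea of interpreting a reduced local region more concrete, below is a minimal illustrative sketch, not the report's implementation: it assumes a toy representation in which primitive components (points, contours, regions) carry semantic labels, and hypothetical relations (e.g., a contour endpoint lying near a labeled point) score how internally consistent a candidate labeling is. All class, function, and label names here are assumptions introduced for illustration only.

```python
# Illustrative sketch only: a toy data structure for "full interpretation" of a
# reduced local image region, loosely following the abstract's description of
# primitive components (points, contours, regions) and relations among them.
# None of these names come from the report itself.

from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Point = Tuple[float, float]  # (x, y) in patch coordinates


@dataclass
class Primitive:
    """A primitive component inside a minimal region: a point, contour, or region."""
    kind: str              # "point" | "contour" | "region"
    geometry: List[Point]  # sampled coordinates describing the primitive
    label: str = ""        # semantic label assigned by the interpretation, e.g. "upper edge"


@dataclass
class Interpretation:
    """A candidate interpretation: labeled primitives plus relation-based scoring."""
    primitives: List[Primitive]
    relations: List[Callable[["Interpretation"], float]] = field(default_factory=list)

    def score(self) -> float:
        # Sum of relation scores; a higher total means the configuration of
        # labeled primitives is more consistent.
        return sum(rel(self) for rel in self.relations)


def endpoint_near(contour_label: str, point_label: str, tol: float = 3.0):
    """Hypothetical relation: a labeled contour should terminate near a labeled point."""
    def rel(interp: Interpretation) -> float:
        contours = [p for p in interp.primitives if p.label == contour_label]
        points = [p for p in interp.primitives if p.label == point_label]
        if not contours or not points:
            return 0.0
        (cx, cy), (px, py) = contours[0].geometry[-1], points[0].geometry[0]
        dist = ((cx - px) ** 2 + (cy - py) ** 2) ** 0.5
        return 1.0 if dist <= tol else 0.0
    return rel


if __name__ == "__main__":
    # Toy example: two primitives in a small patch, one relation between them.
    candidate = Interpretation(
        primitives=[
            Primitive("contour", [(0.0, 0.0), (4.0, 4.0)], label="upper edge"),
            Primitive("point", [(5.0, 5.0)], label="corner"),
        ],
        relations=[endpoint_near("upper edge", "corner")],
    )
    print("relation score:", candidate.score())  # endpoint within tolerance -> 1.0
```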
Main Authors: | Ben-Yosef, Guy; Assif, Liav; Ullman, Shimon |
---|---|
Format: | Technical Report |
Language: | en_US |
Published: | Center for Brains, Minds and Machines (CBMM), 2017 |
Subjects: | Image interpretation; Visual object recognition; Parts and relations |
Online Access: | http://hdl.handle.net/1721.1/106887 |
_version_ | 1811074289956290560 |
---|---|
author | Ben-Yosef, Guy; Assif, Liav; Ullman, Shimon |
author_facet | Ben-Yosef, Guy; Assif, Liav; Ullman, Shimon |
author_sort | Ben-Yosef, Guy |
collection | MIT |
description | The goal of this work is to model the process of ‘full interpretation’ of object images, which is the ability to identify and localize all semantic features and parts that are recognized by human observers. The task is approached by dividing the interpretation of the complete object into the interpretation of multiple reduced but interpretable local regions. In such reduced regions, interpretation is simpler, since the number of semantic components is small and the variability of possible configurations is low.
We model the interpretation process by identifying primitive components and relations that play a useful role in local interpretation by humans. To identify the components and relations used in the interpretation process, we consider the interpretation of ‘minimal configurations’: reduced local regions that are minimal in the sense that further reduction renders them unrecognizable and uninterpretable. We show that such minimal interpretable images have useful properties, which we use to identify informative features and relations for full interpretation. We describe our interpretation model and show results of detailed interpretations of minimal configurations, produced automatically by the model. Finally, we discuss the implications of full interpretation for difficult visual tasks, such as recognizing human activities or interactions, which are beyond the scope of current models of visual recognition. |
first_indexed | 2024-09-23T09:46:43Z |
format | Technical Report |
id | mit-1721.1/106887 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T09:46:43Z |
publishDate | 2017 |
publisher | Center for Brains, Minds and Machines (CBMM) |
record_format | dspace |
spelling | mit-1721.1/106887 2019-09-12T22:58:43Z Full interpretation of minimal images Ben-Yosef, Guy; Assif, Liav; Ullman, Shimon Image interpretation; Visual object recognition; Parts and relations This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216. 2017-02-08T19:25:35Z 2017-02-08T19:25:35Z 2017-02-08 Technical Report; Working Paper; Other http://hdl.handle.net/1721.1/106887 en_US CBMM Memo Series;061 Attribution-NonCommercial-ShareAlike 3.0 United States http://creativecommons.org/licenses/by-nc-sa/3.0/us/ application/pdf Center for Brains, Minds and Machines (CBMM) |
spellingShingle | Image interpretation; Visual object recognition; Parts and relations; Ben-Yosef, Guy; Assif, Liav; Ullman, Shimon; Full interpretation of minimal images |
title | Full interpretation of minimal images |
title_full | Full interpretation of minimal images |
title_fullStr | Full interpretation of minimal images |
title_full_unstemmed | Full interpretation of minimal images |
title_short | Full interpretation of minimal images |
title_sort | full interpretation of minimal images |
topic | Image interpretation; Visual object recognition; Parts and relations |
url | http://hdl.handle.net/1721.1/106887 |
work_keys_str_mv | AT benyosefguy fullinterpretationofminimalimages AT assifliav fullinterpretationofminimalimages AT ullmanshimon fullinterpretationofminimalimages |