Object-Oriented Deep Learning
We investigate an unconventional direction of research that aims at converting neural networks, a class of distributed, connectionist, sub-symbolic models, into a symbolic level with the ultimate goal of achieving AI interpretability and safety. To that end, we propose Object-Oriented Deep Learning, a novel computational paradigm of deep learning that adopts interpretable “objects/symbols” as a basic representational atom instead of N-dimensional tensors (as in traditional “feature-oriented” deep learning). For visual processing, each “object/symbol” can explicitly package common properties of visual objects like its position, pose, scale, probability of being an object, pointers to parts, etc., providing a full spectrum of interpretable visual knowledge throughout all layers. It achieves a form of “symbolic disentanglement”, offering one solution to the important problem of disentangled representations and invariance. Basic computations of the network include predicting high-level objects and their properties from low-level objects and binding/aggregating relevant objects together. These computations operate at a more fundamental level than convolutions, capturing convolution as a special case while being significantly more general. All operations are executed in an input-driven fashion, so sparsity and dynamic computation per sample are naturally supported, complementing recent popular ideas of dynamic networks, and may enable new types of hardware acceleration. We experimentally show on CIFAR-10 that it can perform flexible visual processing, rivaling the performance of a ConvNet, but without using any convolution. Furthermore, it can generalize to novel rotations of images that it was not trained for.
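The abstract describes an “object/symbol” atom that packages position, pose, scale, objectness probability, and pointers to parts, plus computations that predict high-level objects from low-level ones and bind them together. As a rough, hypothetical illustration only (not code from the report; all names and the aggregation rules here are invented for this sketch), such an atom and a toy bind step might look like:

```python
# Hypothetical sketch, NOT from the paper: one possible encoding of the
# "object/symbol" atom described in the abstract, and a toy stand-in for
# the predict-and-bind computation. All names and rules are assumptions.
from dataclasses import dataclass, field
from typing import List

@dataclass
class VisualObject:
    x: float          # position (horizontal)
    y: float          # position (vertical)
    pose: float       # orientation, radians
    scale: float
    p_object: float   # probability of being an object
    parts: List["VisualObject"] = field(default_factory=list)  # pointers to parts

def bind(parts: List[VisualObject]) -> VisualObject:
    """Aggregate low-level objects into one predicted high-level object.
    Toy rules: average position/pose, max scale, min objectness."""
    n = len(parts)
    return VisualObject(
        x=sum(p.x for p in parts) / n,
        y=sum(p.y for p in parts) / n,
        pose=sum(p.pose for p in parts) / n,
        scale=max(p.scale for p in parts),
        p_object=min(p.p_object for p in parts),
        parts=parts,
    )

corner = VisualObject(1.0, 2.0, 0.0, 1.0, 0.9)
edge = VisualObject(3.0, 2.0, 0.0, 1.0, 0.8)
shape = bind([corner, edge])
print(shape.x, shape.p_object)  # 2.0 0.8
```

Because every layer passes such records rather than anonymous tensor activations, each intermediate result stays inspectable, which is the interpretability property the abstract emphasizes.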
Main Authors: | Liao, Qianli; Poggio, Tomaso |
---|---|
Format: | Technical Report |
Language: | en_US |
Published: | Center for Brains, Minds and Machines (CBMM), 2017 |
Institution: | Massachusetts Institute of Technology |
Subjects: | AI; artificial intelligence; neural networks |
Online Access: | http://hdl.handle.net/1721.1/112103 |
Date Issued: | 2017-10-31 |
Series: | CBMM Memo Series;070 |
Rights: | Attribution-NonCommercial-ShareAlike 3.0 United States (http://creativecommons.org/licenses/by-nc-sa/3.0/us/) |
Funding: | This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216. |