Object-Oriented Deep Learning

We investigate an unconventional direction of research that aims at converting neural networks, a class of distributed, connectionist, sub-symbolic models, into a symbolic level with the ultimate goal of achieving AI interpretability and safety. To that end, we propose Object-Oriented Deep Learning, a novel computational paradigm of deep learning that adopts interpretable “objects/symbols” as its basic representational atom instead of N-dimensional tensors (as in traditional “feature-oriented” deep learning). For visual processing, each “object/symbol” can explicitly package common properties of visual objects, such as its position, pose, scale, probability of being an object, and pointers to parts, providing a full spectrum of interpretable visual knowledge throughout all layers. It achieves a form of “symbolic disentanglement”, offering one solution to the important problem of disentangled representations and invariance. The network's basic computations include predicting high-level objects and their properties from low-level objects, and binding/aggregating relevant objects together. These computations operate at a more fundamental level than convolutions, capturing convolution as a special case while being significantly more general. All operations are executed in an input-driven fashion, so sparsity and dynamic computation per sample are naturally supported, complementing recent popular ideas of dynamic networks and possibly enabling new types of hardware acceleration. We show experimentally on CIFAR-10 that the model performs flexible visual processing, rivaling the performance of ConvNets, without using any convolution. Furthermore, it generalizes to novel rotations of images that it was not trained on.
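The report itself ships no code, but the representation the abstract describes can be illustrated with a short sketch. Each "object/symbol" carries its properties explicitly (position, pose, scale, objectness probability, pointers to parts) rather than hiding them in a dense tensor, and low-level objects are bound/aggregated into a high-level object. All names and the toy weighted-average aggregation rule below are hypothetical illustrations, not the paper's actual method:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class VisualObject:
    """One 'object/symbol': every property is explicit and inspectable,
    instead of being an anonymous slice of an N-dimensional tensor.
    All field names are illustrative, not taken from the report."""
    x: float            # position (horizontal)
    y: float            # position (vertical)
    pose: float         # orientation, in radians
    scale: float        # size relative to some canonical size
    p_object: float     # probability of being an object
    parts: List["VisualObject"] = field(default_factory=list)  # pointers to parts

def predict_parent(parts: List[VisualObject]) -> VisualObject:
    """Bind/aggregate low-level objects into one high-level object.
    Toy rule: average the parts' geometry, weighted by objectness."""
    w = sum(p.p_object for p in parts)
    return VisualObject(
        x=sum(p.x * p.p_object for p in parts) / w,
        y=sum(p.y * p.p_object for p in parts) / w,
        pose=sum(p.pose * p.p_object for p in parts) / w,
        scale=sum(p.scale * p.p_object for p in parts) / w,
        p_object=w / len(parts),   # crude aggregate confidence
        parts=parts,               # keep interpretable pointers to the parts
    )

# Two hypothetical low-level "corner" objects vote for an "edge" between them.
corner_a = VisualObject(x=0.0, y=0.0, pose=0.0, scale=1.0, p_object=1.0)
corner_b = VisualObject(x=2.0, y=2.0, pose=0.0, scale=1.0, p_object=1.0)
edge = predict_parent([corner_a, corner_b])
```

Because each layer's output is a set of such objects rather than a tensor, only objects that are actually present need to be processed, which is one reading of the abstract's input-driven sparsity claim.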

Bibliographic Details
Main Authors: Liao, Qianli; Poggio, Tomaso
Format: Technical Report
Language: en_US
Published: Center for Brains, Minds and Machines (CBMM), 2017
Subjects: AI; artificial intelligence; neural networks
Online Access: http://hdl.handle.net/1721.1/112103
Series: CBMM Memo Series; 070
Institution: Massachusetts Institute of Technology
Date Issued: 2017-10-31
License: Attribution-NonCommercial-ShareAlike 3.0 United States (http://creativecommons.org/licenses/by-nc-sa/3.0/us/)

This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.