Object-Oriented Deep Learning

We investigate an unconventional direction of research that aims at converting neural networks, a class of distributed, connectionist, sub-symbolic models, into a symbolic level with the ultimate goal of achieving AI interpretability and safety. To that end, we propose Object-Oriented Deep Learning, a novel computational paradigm of deep learning that adopts interpretable “objects/symbols” as its basic representational atom instead of N-dimensional tensors (as in traditional “feature-oriented” deep learning). For visual processing, each “object/symbol” can explicitly package common properties of visual objects, such as its position, pose, scale, probability of being an object, and pointers to parts, providing a full spectrum of interpretable visual knowledge throughout all layers. It achieves a form of “symbolic disentanglement”, offering one solution to the important problem of disentangled representations and invariance. The network's basic computations include predicting high-level objects and their properties from low-level objects, and binding/aggregating relevant objects together. These computations operate at a more fundamental level than convolutions, capturing convolution as a special case while being significantly more general. All operations are executed in an input-driven fashion, so sparsity and dynamic computation per sample are naturally supported, complementing recent popular ideas of dynamic networks and possibly enabling new types of hardware acceleration. We show experimentally on CIFAR-10 that the model performs flexible visual processing, rivaling the performance of ConvNets, without using any convolution. Furthermore, it generalizes to novel rotations of images that it was not trained on.
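The report itself ships no code, but the representation the abstract describes can be illustrated with a short sketch. Each "object/symbol" carries its properties explicitly (position, pose, scale, objectness probability, pointers to parts) rather than hiding them in a dense tensor, and low-level objects are bound/aggregated into a high-level object. All names and the toy weighted-average aggregation rule below are hypothetical illustrations, not the paper's actual method:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class VisualObject:
    """One 'object/symbol': every property is explicit and inspectable,
    instead of being an anonymous slice of an N-dimensional tensor.
    All field names are illustrative, not taken from the report."""
    x: float            # position (horizontal)
    y: float            # position (vertical)
    pose: float         # orientation, in radians
    scale: float        # size relative to some canonical size
    p_object: float     # probability of being an object
    parts: List["VisualObject"] = field(default_factory=list)  # pointers to parts

def predict_parent(parts: List[VisualObject]) -> VisualObject:
    """Bind/aggregate low-level objects into one high-level object.
    Toy rule: average the parts' geometry, weighted by objectness."""
    w = sum(p.p_object for p in parts)
    return VisualObject(
        x=sum(p.x * p.p_object for p in parts) / w,
        y=sum(p.y * p.p_object for p in parts) / w,
        pose=sum(p.pose * p.p_object for p in parts) / w,
        scale=sum(p.scale * p.p_object for p in parts) / w,
        p_object=w / len(parts),   # crude aggregate confidence
        parts=parts,               # keep interpretable pointers to the parts
    )

# Two hypothetical low-level "corner" objects vote for an "edge" between them.
corner_a = VisualObject(x=0.0, y=0.0, pose=0.0, scale=1.0, p_object=1.0)
corner_b = VisualObject(x=2.0, y=2.0, pose=0.0, scale=1.0, p_object=1.0)
edge = predict_parent([corner_a, corner_b])
```

Because each layer's output is a set of such objects rather than a tensor, only objects that are actually present need to be processed, which is one reading of the abstract's input-driven sparsity claim.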

Bibliographic Details
Main Authors: Liao, Qianli; Poggio, Tomaso
Format: Technical Report
Language: en_US
Published: Center for Brains, Minds and Machines (CBMM), 2017
Subjects: AI; artificial intelligence; neural networks
Online Access: http://hdl.handle.net/1721.1/112103
Series: CBMM Memo Series; 070
Institution: Massachusetts Institute of Technology
Date Issued: 2017-10-31
License: Attribution-NonCommercial-ShareAlike 3.0 United States (http://creativecommons.org/licenses/by-nc-sa/3.0/us/)

This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.