Deep convolutional inverse graphics network

This paper presents the Deep Convolution Inverse Graphics Network (DC-IGN), a model that aims to learn an interpretable representation of images, disentangled with respect to three-dimensional scene structure and viewing transformations such as depth rotations and lighting variations. The DC-IGN model is composed of multiple layers of convolution and de-convolution operators and is trained using the Stochastic Gradient Variational Bayes (SGVB) algorithm [10]. We propose a training procedure to encourage neurons in the graphics code layer to represent a specific transformation (e.g. pose or light). Given a single input image, our model can generate new images of the same object with variations in pose and lighting. We present qualitative and quantitative tests of the model's efficacy at learning a 3D rendering engine for varied object classes including faces and chairs.
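The abstract mentions two technical ingredients: training with Stochastic Gradient Variational Bayes (the reparameterization trick plus a closed-form KL term) and a procedure that encourages individual graphics-code neurons to capture one transformation each. The following is a minimal NumPy sketch of those two ideas, not the authors' implementation; the function names and the batch-mean clamping heuristic are illustrative assumptions based on the abstract's description.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var, rng):
    # SGVB reparameterization: z = mu + sigma * eps with eps ~ N(0, I),
    # which makes the sampled latent differentiable w.r.t. mu and log_var.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims.
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

def clamp_inactive_latents(z, active_idx):
    # Sketch of a disentangling step: for a minibatch in which only one
    # scene variable (e.g. pose) changes, replace every other latent with
    # its batch mean so only the "active" unit carries the variation.
    z_clamped = np.tile(z.mean(axis=0), (z.shape[0], 1))
    z_clamped[:, active_idx] = z[:, active_idx]
    return z_clamped

# Toy batch of 4 examples with a 3-dimensional graphics code.
mu = np.zeros((4, 3))
log_var = np.zeros((4, 3))
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)  # zero when posterior == prior
```

In this sketch `kl` is exactly zero because the toy posterior equals the standard-normal prior; in training, the KL term would be added to a reconstruction loss and minimized by gradient descent.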


Bibliographic Details
Main Authors: Kohli, Pushmeet, Kulkarni, Tejas Dattatraya, Whitney, William F., Tenenbaum, Joshua B
Other Authors: Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences
Format: Article
Published: Neural Information Processing Systems Foundation, Inc 2017
Online Access: http://hdl.handle.net/1721.1/112752
https://orcid.org/0000-0002-7077-2765
https://orcid.org/0000-0002-0628-6789
https://orcid.org/0000-0002-1925-2035
Citation: Kulkarni, Tejas D. et al. "Deep convolutional inverse graphics network." Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS 2015), December 7-12, 2015, Montreal, Canada. Neural Information Processing Systems Foundation, 2015.
Publisher's Version: https://papers.nips.cc/paper/5851-deep-convolutional-inverse-graphics-network
Terms of Use: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.