Deep convolutional inverse graphics network
This paper presents the Deep Convolution Inverse Graphics Network (DC-IGN), a model that aims to learn an interpretable representation of images, disentangled with respect to three-dimensional scene structure and viewing transformations such as depth rotations and lighting variations. The DC-IGN model is composed of multiple layers of convolution and de-convolution operators and is trained using the Stochastic Gradient Variational Bayes (SGVB) algorithm [10]. We propose a training procedure to encourage neurons in the graphics code layer to represent a specific transformation (e.g. pose or light). Given a single input image, our model can generate new images of the same object with variations in pose and lighting. We present qualitative and quantitative tests of the model's efficacy at learning a 3D rendering engine for varied object classes including faces and chairs.
Main Authors: | Kohli, Pushmeet; Kulkarni, Tejas Dattatraya; Whitney, William F.; Tenenbaum, Joshua B |
Other Authors: | Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences |
Format: | Article |
Published: | Neural Information Processing Systems Foundation, Inc, 2017 |
Online Access: | http://hdl.handle.net/1721.1/112752 https://orcid.org/0000-0002-7077-2765 https://orcid.org/0000-0002-0628-6789 https://orcid.org/0000-0002-1925-2035 |
_version_ | 1826196570892140544 |
---|---|
author | Kohli, Pushmeet Kulkarni, Tejas Dattatraya Whitney, William F. Tenenbaum, Joshua B |
author2 | Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences |
author_sort | Kohli, Pushmeet |
collection | MIT |
description | This paper presents the Deep Convolution Inverse Graphics Network (DC-IGN), a model that aims to learn an interpretable representation of images, disentangled with respect to three-dimensional scene structure and viewing transformations such as depth rotations and lighting variations. The DC-IGN model is composed of multiple layers of convolution and de-convolution operators and is trained using the Stochastic Gradient Variational Bayes (SGVB) algorithm [10]. We propose a training procedure to encourage neurons in the graphics code layer to represent a specific transformation (e.g. pose or light). Given a single input image, our model can generate new images of the same object with variations in pose and lighting. We present qualitative and quantitative tests of the model's efficacy at learning a 3D rendering engine for varied object classes including faces and chairs. |
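The description above mentions two concrete techniques: sampling latent codes with the SGVB reparameterization trick, and a training procedure that pushes individual graphics-code neurons to capture one transformation at a time. A minimal NumPy sketch of both ideas follows; the function names, array shapes, and the clamp-to-batch-mean detail are illustrative assumptions for exposition, not an implementation taken from the paper or this record.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """SGVB reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).

    Sampling this way keeps the path from the encoder's outputs (mu,
    log_var) to the sample differentiable, which is what lets SGVB
    train the encoder with ordinary stochastic gradients."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def clamp_batch(z, free_idx):
    """Disentangling step (illustrative): within a mini-batch where only
    one scene transformation (e.g. pose or light) varies, every latent
    unit except `free_idx` is clamped to its batch mean, so only the
    'free' unit can explain the variation in that batch."""
    z_clamped = np.tile(z.mean(axis=0), (z.shape[0], 1))
    z_clamped[:, free_idx] = z[:, free_idx]
    return z_clamped

# Toy batch: 8 images encoded to 4 latent units (hypothetical sizes).
mu = rng.standard_normal((8, 4))
log_var = rng.standard_normal((8, 4)) * 0.1
z = reparameterize(mu, log_var)

# Suppose unit 0 is assigned to the varying transformation; clamp the rest.
z_train = clamp_batch(z, free_idx=0)

print(np.ptp(z_train[:, 0]) > 0)                        # unit 0 still varies
print(np.allclose(np.ptp(z_train[:, 1:], axis=0), 0))   # other units constant
```

Feeding `z_train` to the decoder during such a batch is what associates the free unit with the transformation that actually changed across the batch's images.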
first_indexed | 2024-09-23T10:29:12Z |
format | Article |
id | mit-1721.1/112752 |
institution | Massachusetts Institute of Technology |
last_indexed | 2024-09-23T10:29:12Z |
publishDate | 2017 |
publisher | Neural Information Processing Systems Foundation, Inc |
record_format | dspace |
spelling | mit-1721.1/112752 2022-09-27T09:47:17Z. Deep convolutional inverse graphics network. Kohli, Pushmeet; Kulkarni, Tejas Dattatraya; Whitney, William F.; Tenenbaum, Joshua B. Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences. Dates: 2017-12-14T15:30:20Z; 2017-12-14T15:30:20Z; 2015; 2017-12-08T17:40:25Z. Article (http://purl.org/eprint/type/ConferencePaper). http://hdl.handle.net/1721.1/112752. Citation: Kulkarni, Tejas D. et al. "Deep convolutional inverse graphics network." Proceedings of the 28th International Conference on Neural Information Processing Systems (NIPS 2015), December 7-12, 2015, Montreal, Canada. Neural Information Processing Systems Foundation, 2015. © 2015 Neural Information Processing Systems Foundation, Inc. ORCID: https://orcid.org/0000-0002-7077-2765, https://orcid.org/0000-0002-0628-6789, https://orcid.org/0000-0002-1925-2035. https://papers.nips.cc/paper/5851-deep-convolutional-inverse-graphics-network. Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf. Neural Information Processing Systems Foundation, Inc. Neural Information Processing Systems (NIPS) |
title | Deep convolutional inverse graphics network |
url | http://hdl.handle.net/1721.1/112752 https://orcid.org/0000-0002-7077-2765 https://orcid.org/0000-0002-0628-6789 https://orcid.org/0000-0002-1925-2035 |