Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks


Bibliographic Details
Main Authors: Xue, Tianfan, Wu, Jiajun, Bouman, Katherine L, Freeman, William T
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers (IEEE) 2021
Online Access: https://hdl.handle.net/1721.1/135825
Description: We study the problem of synthesizing a number of likely future frames from a single input image. In contrast to traditional methods that have tackled this problem in a deterministic or non-parametric way, we propose to model future frames in a probabilistic manner. Our probabilistic model makes it possible for us to sample and synthesize many possible future frames from a single input image. To synthesize realistic movement of objects, we propose a novel network structure, namely a Cross Convolutional Network; this network encodes image and motion information as feature maps and convolutional kernels, respectively. In experiments, our model performs well on synthetic data, such as 2D shapes and animated game sprites, and on real-world video frames. We present analyses of the learned network representations, showing it is implicitly learning a compact encoding of object appearance and motion. We also demonstrate a few of its applications, including visual analogy-making and video extrapolation.
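The description's central idea, image content held as feature maps and motion held as convolutional kernels, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `cross_convolve`, the array shapes, the zero padding, and the hand-set kernels are all assumptions made for illustration. In a trained network the kernels would be produced by the model rather than written by hand.

```python
import numpy as np

def cross_convolve(feature_maps, kernels):
    """Apply one kernel per channel to the matching feature map.

    feature_maps: array (C, H, W), standing in for the image encoding.
    kernels:      array (C, k, k) with odd k, standing in for the motion encoding.
    Returns an array (C, H, W); borders are zero-padded so the size is unchanged.
    """
    C, H, W = feature_maps.shape
    Ck, k, k2 = kernels.shape
    if Ck != C or k != k2 or k % 2 == 0:
        raise ValueError("expected kernels of shape (C, k, k) with odd k")
    pad = k // 2
    padded = np.pad(feature_maps, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((C, H, W), dtype=float)
    # Cross-correlation (deep-learning convention: the kernel is not flipped).
    for dy in range(k):
        for dx in range(k):
            weight = kernels[:, dy, dx, None, None]        # shape (C, 1, 1)
            out += weight * padded[:, dy:dy + H, dx:dx + W]
    return out

# Hypothetical toy input: two channels, each with a single bright pixel.
features = np.zeros((2, 5, 5))
features[:, 2, 2] = 1.0

kernels = np.zeros((2, 3, 3))
kernels[0, 1, 1] = 1.0  # channel 0: centred kernel, content stays in place
kernels[1, 1, 0] = 1.0  # channel 1: off-centre kernel, content moves one pixel right

moved = cross_convolve(features, kernels)
```

Because each channel gets its own kernel, different kernels move the same image content in different ways. Under these assumptions, that is how a single input image could lead to many different future frames.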
Institution: Massachusetts Institute of Technology
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
DOI: 10.1109/TPAMI.2018.2854726
Type: Article (http://purl.org/eprint/type/JournalArticle)
Record dates: 2019; 2019-05-28T12:00:32Z; 2021-10-27T20:29:30Z
License: Creative Commons Attribution-Noncommercial-Share Alike (http://creativecommons.org/licenses/by-nc-sa/4.0/)
File format: application/pdf
Source: MIT web domain