Generative modeling of dynamic visual scenes

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.

Bibliographic Details
Main Author: Lin, Dahua, Ph. D. Massachusetts Institute of Technology
Other Authors: John Fisher.
Format: Thesis
Language: eng
Published: Massachusetts Institute of Technology 2013
Subjects: Electrical Engineering and Computer Science.
Online Access: http://hdl.handle.net/1721.1/78453
Notes: Cataloged from PDF version of thesis. Includes bibliographical references (p. 301-312).

Abstract: Modeling visual scenes is one of the fundamental tasks of computer vision. While tremendous effort has been devoted to video analysis over the past decades, most prior work focuses on specific tasks, leading to dedicated methods for solving them. This thesis instead aims to derive a probabilistic generative model that coherently integrates different aspects of a scene, notably appearance, motion, and the interaction between them. Specifically, the model treats each video as a composite of dynamic layers, each associated with a covering domain, an appearance template, and a flow describing its motion. These layers change dynamically following their associated flows and are combined into video frames according to a Z-order that specifies their relative depth. To describe the layers and their dynamic changes, three major components are incorporated: (1) An appearance model describes the generative process of the pixel values of a video layer. By combining a probabilistic patch manifold with a conditional Markov random field, this model expresses rich local detail while maintaining global coherence. (2) A motion model captures the motion pattern of a layer through a new concept called geometric flow, which originates from differential geometric analysis.
A geometric flow unifies the trajectory-based representation and the notion of geometric transformation to represent collective dynamic behaviors that persist over time. (3) A partial Z-order specifies the relative depth order between layers. Through the unique correspondence between equivalence classes of partial orders and consistent choice functions, a distribution over the space of partial orders is established, and inference can be performed thereon. The development of these models raises significant challenges in probabilistic modeling and inference that require new techniques. We studied two important problems: (1) Both the appearance model and the motion model rely on mixture modeling to capture complex distributions. In a dynamic setting, the component parameters and the number of components in a mixture model can change over time. While using Dirichlet processes (DPs) as priors allows an indefinite number of components, incorporating temporal dependencies between DPs remains a nontrivial issue, both theoretically and practically. Our work on this problem leads to a new construction of dependent DPs, enabling various forms of dynamic variation for nonparametric mixture models by harnessing the connections between Poisson and Dirichlet processes. (2) Inferring the partial Z-order of a video requires sampling from the posterior distribution over partial orders. A key challenge is that the underlying space of partial orders is disconnected, so one may not be able to make local updates without violating the combinatorial constraints that define a partial order. We developed a novel sampling method to tackle this problem, which dynamically introduces virtual states as bridges between different parts of the space, implicitly yielding an ergodic Markov chain over an augmented space.
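The combinatorial constraints mentioned above can be made concrete with a small check. This is purely an illustration (the function name and the pair-set encoding of the Z-order are my own, not the thesis's representation), showing why a naive local move can leave the space of valid partial orders:

```python
def is_strict_partial_order(elements, rel):
    """Check the constraints of a strict partial order.

    rel is a set of (a, b) pairs, read here as "layer a is in front
    of layer b" (the direction is just a convention for this sketch).
    """
    for a in elements:
        if (a, a) in rel:  # irreflexivity: no layer is in front of itself
            return False
    for (a, b) in rel:
        for (c, d) in rel:
            if b == c and (a, d) not in rel:  # transitivity must hold
                return False
    return True

# A valid Z-order on three layers: 1 in front of 2 in front of 3.
full = {(1, 2), (2, 3), (1, 3)}
print(is_strict_partial_order({1, 2, 3}, full))             # True
# Removing a single pair -- a "local" update -- breaks transitivity:
print(is_strict_partial_order({1, 2, 3}, full - {(1, 3)}))  # False
```

Because single-pair updates like the one above can exit the valid set, a Markov chain restricted to valid partial orders can get stuck in disconnected regions, which is the motivation the abstract gives for bridging virtual states.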
With this generative model of visual scenes, many vision problems can be readily solved through inference performed on the model. Empirical experiments demonstrate that the framework yields promising results on a series of practical tasks, including video denoising and inpainting, collective motion analysis, and semantic scene understanding.

Physical Description: 312 p. (application/pdf)
Rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See http://dspace.mit.edu/handle/1721.1/7582 for inquiries about permission.
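As a sketch of the nonparametric mixture machinery the abstract builds on: a plain Dirichlet-process prior over mixture weights can be simulated by stick-breaking (the GEM construction). This is the generic textbook construction, not the thesis's dependent-DP extension; all names and parameter values here are illustrative:

```python
import numpy as np

def stick_breaking_weights(alpha, rng, tol=1e-6):
    """Draw mixture weights from DP(alpha) via stick-breaking (GEM).

    Each step breaks off a Beta(1, alpha) fraction of the remaining
    stick; truncation at tol approximates the infinite sequence.
    """
    weights, remaining = [], 1.0
    while remaining > tol:
        v = rng.beta(1.0, alpha)        # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= (1.0 - v)
    return np.array(weights)

rng = np.random.default_rng(0)
w = stick_breaking_weights(alpha=2.0, rng=rng)
print(len(w), w.sum())  # many components; total weight close to 1
```

Larger alpha spreads mass over more components; the modeling problem the abstract highlights is making such weights (and the number of active components) evolve coherently over time.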