Inferring Structured World Models from Videos

Advances in reinforcement learning have allowed agents to learn a variety of board games and video games at superhuman levels. Unlike humans, who can generalize to a wide range of tasks with very little experience, these algorithms typically need vast numbers of experience replays to perform at the same level. In this thesis, we propose a model-based reinforcement learning approach that represents the environment using an explicit symbolic model in the form of a domain-specific language (DSL), which represents the world as a set of discrete objects with underlying latent properties that govern their dynamical interactions. We present a novel, neurally guided, online inference technique to recover the structured world representation from raw video observations, intended for use in downstream model-based planning. We qualitatively evaluate our inference performance on classical Atari games, as well as on physics-based mobile games.
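To make the abstract's representation concrete, here is a minimal sketch of what a structured world state — a set of discrete objects whose latent properties (velocity, gravity) govern their dynamics — might look like. All class and field names are illustrative assumptions, not the thesis's actual DSL.

```python
from dataclasses import dataclass

# Hypothetical sketch of a structured world model: each object has an
# observable position plus latent properties that drive its dynamics.
@dataclass
class WorldObject:
    x: float
    y: float
    vx: float = 0.0       # latent: horizontal velocity
    vy: float = 0.0       # latent: vertical velocity
    gravity: float = 0.0  # latent: per-object gravitational pull

    def step(self, dt: float = 1.0) -> "WorldObject":
        # Forward-simulate one time step under the latent properties.
        vy = self.vy + self.gravity * dt
        return WorldObject(self.x + self.vx * dt,
                           self.y + vy * dt,
                           self.vx, vy, self.gravity)

@dataclass
class WorldState:
    objects: list

    def step(self, dt: float = 1.0) -> "WorldState":
        # The world advances by stepping every object independently.
        return WorldState([o.step(dt) for o in self.objects])

# Example: a ball moving right while falling under per-object gravity.
ball = WorldObject(x=0.0, y=10.0, vx=1.0, gravity=-1.0)
state = WorldState([ball]).step()
print(state.objects[0].x, state.objects[0].y)  # 1.0 9.0
```

A planner could roll such a state forward to evaluate candidate actions, which is the "downstream model-based planning" use the abstract mentions; recovering the latent fields from video is the inference problem the thesis addresses.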

Bibliographic Details
Main Author: Kapur, Shreyas
Other Authors: Tenenbaum, Joshua B.
Format: Thesis
Published: Massachusetts Institute of Technology, 2022
Online Access: https://hdl.handle.net/1721.1/144497
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: M.Eng.
Date Issued: May 2022
Rights: In Copyright - Educational Use Permitted; Copyright MIT (http://rightsstatements.org/page/InC-EDU/1.0/)
File Format: application/pdf