Projected equation and aggregation-based approximate dynamic programming methods for Tetris

Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.

Bibliographic Details
Main Author: Hwang, Daw-sen
Other Authors: Dimitri P. Bertsekas.
Format: Thesis
Language:eng
Published: Massachusetts Institute of Technology 2011
Subjects:
Online Access:http://hdl.handle.net/1721.1/66033
_version_ 1826192360418050048
author Hwang, Daw-sen
author2 Dimitri P. Bertsekas.
author_facet Dimitri P. Bertsekas.
Hwang, Daw-sen
author_sort Hwang, Daw-sen
collection MIT
description Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.
first_indexed 2024-09-23T09:10:16Z
format Thesis
id mit-1721.1/66033
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T09:10:16Z
publishDate 2011
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/660332019-04-10T21:08:48Z Projected equation and aggregation-based approximate dynamic programming methods for Tetris Approximate dynamic programming : projected equation and aggregation methods for Tetris Hwang, Daw-sen Dimitri P. Bertsekas. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 65-67). In this thesis, we survey approximate dynamic programming (ADP) methods and test the methods with the game of Tetris. We focus on ADP methods where the cost-to- go function J is approximated with [phi]r, where [phi] is some matrix and r is a vector with relatively low dimension. There are two major categories of methods: projected equation methods and aggregation methods. In projected equation methods, the cost-to-go function approximation [phi]r is updated by simulation using one of several policy-updated algorithms such as LSTD([lambda]) [BB96], and LSPE(A) [B196]. Projected equation methods generally may not converge. We define a pseudometric of policies and view the oscillations of policies in Tetris. Aggregation methods are based on a model approximation approach. The original problem is reduced to an aggregate problem with significantly fewer states. The weight vector r is the cost-to-go function of the aggregate problem and [phi] is the matrix of aggregation probabilities. In aggregation methods, the vector r converges to the optimal cost-to-go function of the aggregate problem. In this thesis, we implement aggregation methods for Tetris, and compare the performance of projected equation methods and aggregation methods. by Daw-sen Hwang. S.M. 2011-09-27T18:34:48Z 2011-09-27T18:34:48Z 2011 2011 Thesis http://hdl.handle.net/1721.1/66033 752149312 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 111 p. application/pdf Massachusetts Institute of Technology
spellingShingle Electrical Engineering and Computer Science.
Hwang, Daw-sen
Projected equation and aggregation-based approximate dynamic programming methods for Tetris
title Projected equation and aggregation-based approximate dynamic programming methods for Tetris
title_full Projected equation and aggregation-based approximate dynamic programming methods for Tetris
title_fullStr Projected equation and aggregation-based approximate dynamic programming methods for Tetris
title_full_unstemmed Projected equation and aggregation-based approximate dynamic programming methods for Tetris
title_short Projected equation and aggregation-based approximate dynamic programming methods for Tetris
title_sort projected equation and aggregation based approximate dynamic programming methods for tetris
topic Electrical Engineering and Computer Science.
url http://hdl.handle.net/1721.1/66033
work_keys_str_mv AT hwangdawsen projectedequationandaggregationbasedapproximatedynamicprogrammingmethodsfortetris
AT hwangdawsen approximatedynamicprogrammingprojectedequationandaggregationmethodsfortetris