Scheduling of costly measurements for state estimation using reinforcement learning

Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 1999.

Bibliographic Details
Main Author: Rogers, Keith Eric
Other Authors: Wallace E. Vander Velde
Format: Thesis
Language: en_US
Published: Massachusetts Institute of Technology, 2005
Subjects: Aeronautics and Astronautics
Online Access: http://hdl.handle.net/1721.1/28216
Includes bibliographical references (p. 257-263).

Abstract:

There has long been a significant gap between the theory and practice of measurement scheduling for state estimation problems. Theoretical papers tend to deal rigorously with small-scale, linear problems using methods that are well grounded in optimization theory; practical applications deal with high-dimensional, nonlinear problems using heuristic policies. The work in this thesis attempts to bridge that gap by using reinforcement learning (RL) to treat real-world problems. In doing so, it makes contributions to the fields of both measurement scheduling and RL. On the measurement scheduling side, a unified formulation is presented which encompasses the wide variety of problems found in the literature as well as more complex variations. This formulation is used with RL to handle a series of problems of increasing difficulty. Both continuous and discrete action spaces are treated, and RL is shown to be effective with both. The RL-based methods are shown to beat alternative methods from the literature in one case, and are able to consistently match or beat heuristics for both high-dimensional linear problems and simple nonlinear problems. Finally, RL is applied to a high-dimensional nonlinear problem in radar tracking and is able to outperform the best available heuristic by as much as 35%. In treating these problems, it is shown that a useful synergy exists between learned and heuristic policies, with each helping to verify and improve the performance of the other.
On the reinforcement learning side, the contribution comes mainly from applying the algorithms in an extremely adverse environment. The measurement scheduling problems treated involve high-dimensional, continuous input spaces and continuous action spaces. The nonlinear cases must use suboptimal nonlinear filters and are hence non-Markovian. Cost feedback comes in terms of internally propagated states with a sometimes tenuous connection to the environment. In a field where typical applications have both finite state spaces and finite action spaces, these problems test the limits of RL's usability. Some advances are also made in the treatment of problems where the cost differential is much smaller in the action direction than in the state direction. Learning algorithms are presented for a class of transformations of Bellman's equation, of which Advantage Learning represents a special case. Conditions under which Advantage Learning may diverge are described, and an alternative algorithm, called G-Learning, is given which fixes the problem for a sample case.

Rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See http://dspace.mit.edu/handle/1721.1/7582 for inquiries about permission.

Physical Description: 263 p. (application/pdf)
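The Advantage Learning update that the abstract generalizes (Harmon and Baird's variant of Q-learning, in which the value gap between actions is rescaled by a constant k) can be sketched in tabular form. The toy two-state MDP, parameter values, and variable names below are illustrative assumptions for this record only; they are not taken from the thesis, and the thesis's G-Learning variant is not reproduced here.

```python
import numpy as np

# Hedged sketch: tabular Advantage Learning on an assumed toy 2-state,
# 2-action MDP (action 0 stays, action 1 flips state; reward 1 for
# landing in state 1). Not the thesis's problem or its G-Learning.
rng = np.random.default_rng(0)
n_states, n_actions = 2, 2
gamma, alpha, k = 0.9, 0.1, 0.3  # k < 1 widens the gap between actions


def step(s, a):
    """Toy dynamics: return (next_state, reward)."""
    s_next = s if a == 0 else 1 - s
    return s_next, float(s_next == 1)


A = np.zeros((n_states, n_actions))  # advantage table, V(s) = max_a A(s, a)
s = 0
for _ in range(5000):
    # Epsilon-greedy action selection.
    a = int(rng.integers(n_actions)) if rng.random() < 0.1 else int(A[s].argmax())
    s_next, r = step(s, a)
    v, v_next = A[s].max(), A[s_next].max()
    # Advantage Learning target: V(s) + (r + gamma * V(s') - V(s)) / k.
    # With k = 1 this reduces to the ordinary Q-learning target.
    target = v + (r + gamma * v_next - v) / k
    A[s, a] += alpha * (target - A[s, a])
    s = s_next

# Greedy policy: flip toward state 1, then stay there.
print(A.argmax(axis=1))  # → [1 0]
```

Dividing the temporal-difference term by k < 1 expands the difference between the best and other actions, which is the property that matters when, as the abstract puts it, the cost differential is much smaller in the action direction than in the state direction.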