Scheduling of costly measurements for state estimation using reinforcement learning

Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 1999.

Bibliographic Details
Main Author: Rogers, Keith Eric
Other Authors: Wallace E. Vander Velde
Format: Thesis
Language: en_US
Published: Massachusetts Institute of Technology, 2005
Subjects: Aeronautics and Astronautics
Online Access: http://hdl.handle.net/1721.1/28216
Includes bibliographical references (p. 257-263).

Abstract:

There has long been a significant gap between the theory and practice of measurement scheduling for state estimation problems. Theoretical papers tend to deal rigorously with small-scale, linear problems using methods that are well grounded in optimization theory; practical applications deal with high-dimensional, nonlinear problems using heuristic policies. The work in this thesis attempts to bridge that gap by using reinforcement learning (RL) to treat real-world problems. In doing so, it makes contributions to the fields of both measurement scheduling and RL. On the measurement scheduling side, a unified formulation is presented which encompasses the wide variety of problems found in the literature as well as more complex variations. This formulation is used with RL to handle a series of problems of increasing difficulty. Both continuous and discrete action spaces are treated, and RL is shown to be effective with both. The RL-based methods are shown to beat alternative methods from the literature in one case, and are able to consistently match or beat heuristics for both high-dimensional linear problems and simple nonlinear problems. Finally, RL is applied to a high-dimensional nonlinear problem in radar tracking and is able to outperform the best available heuristic by as much as 35%. In treating these problems, it is shown that a useful synergy exists between learned and heuristic policies, with each helping to verify and improve the performance of the other.
On the reinforcement learning side, the contribution comes mainly from applying the algorithms in an extremely adverse environment. The measurement scheduling problems treated involve high-dimensional, continuous input spaces and continuous action spaces. The nonlinear cases must use suboptimal nonlinear filters and are hence non-Markovian. Cost feedback comes in terms of internally propagated states with a sometimes tenuous connection to the environment. In a field where typical applications have both finite state spaces and finite action spaces, these problems test the limits of RL's usability. Some advances are also made in the treatment of problems where the cost differential is much smaller in the action direction than in the state direction. Learning algorithms are presented for a class of transformations of Bellman's equation, of which Advantage Learning represents a special case. Conditions under which Advantage Learning may diverge are described, and an alternative algorithm, called G-Learning, is given which fixes the problem for a sample case.

Rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See http://dspace.mit.edu/handle/1721.1/7582 for inquiries about permission.

Physical Description: 263 p. (application/pdf)
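The Advantage Learning update that the abstract generalizes (Harmon and Baird's variant of Q-learning, in which the value gap between actions is rescaled by a constant k) can be sketched in tabular form. The toy two-state MDP, parameter values, and variable names below are illustrative assumptions for this record only; they are not taken from the thesis, and the thesis's G-Learning variant is not reproduced here.

```python
import numpy as np

# Hedged sketch: tabular Advantage Learning on an assumed toy 2-state,
# 2-action MDP (action 0 stays, action 1 flips state; reward 1 for
# landing in state 1). Not the thesis's problem or its G-Learning.
rng = np.random.default_rng(0)
n_states, n_actions = 2, 2
gamma, alpha, k = 0.9, 0.1, 0.3  # k < 1 widens the gap between actions


def step(s, a):
    """Toy dynamics: return (next_state, reward)."""
    s_next = s if a == 0 else 1 - s
    return s_next, float(s_next == 1)


A = np.zeros((n_states, n_actions))  # advantage table, V(s) = max_a A(s, a)
s = 0
for _ in range(5000):
    # Epsilon-greedy action selection.
    a = int(rng.integers(n_actions)) if rng.random() < 0.1 else int(A[s].argmax())
    s_next, r = step(s, a)
    v, v_next = A[s].max(), A[s_next].max()
    # Advantage Learning target: V(s) + (r + gamma * V(s') - V(s)) / k.
    # With k = 1 this reduces to the ordinary Q-learning target.
    target = v + (r + gamma * v_next - v) / k
    A[s, a] += alpha * (target - A[s, a])
    s = s_next

# Greedy policy: flip toward state 1, then stay there.
print(A.argmax(axis=1))  # → [1 0]
```

Dividing the temporal-difference term by k < 1 expands the difference between the best and other actions, which is the property that matters when, as the abstract puts it, the cost differential is much smaller in the action direction than in the state direction.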