Shared Linear Quadratic Regulation Control: A Reinforcement Learning Approach

We propose a controller synthesis for state regulation problems in which a human operator shares control with an autonomy system running in parallel. The autonomy system continuously improves on the human's action with minimal intervention, and can take over full control if necessary. It additively comb...


Bibliographic Details
Main Authors:	Abu-Khalaf, Murad; Karaman, Sertac; Rus, Daniela
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format:	Article
Language:	English
Published:	IEEE 2021
Online Access:	https://hdl.handle.net/1721.1/137170

author	Abu-Khalaf, Murad; Karaman, Sertac; Rus, Daniela
author2	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Abu-Khalaf, Murad; Karaman, Sertac; Rus, Daniela
author_sort	Abu-Khalaf, Murad
collection	MIT
description	We propose a controller synthesis for state regulation problems in which a human operator shares control with an autonomy system running in parallel. The autonomy system continuously improves on the human's action with minimal intervention, and can take over full control if necessary. It additively combines the user input with an adaptive optimal corrective signal to drive the plant. It is adaptive in the sense that it neither estimates nor requires a model of the human's action policy or of the internal dynamics of the plant, and it can adjust to changes in both. Our contribution is twofold. First, we present a new controller synthesis for shared control, which we formulate as an adaptive optimal control problem for continuous-time linear systems and solve online as a human-in-the-loop reinforcement learning problem. The result is an architecture that we call the shared linear quadratic regulator (sLQR). Second, we provide a new analysis of reinforcement learning for continuous-time linear systems in two parts. In the first part, we avoid learning along a single state-space trajectory, which we show leads to data collinearity under certain conditions. In doing so, we clearly separate exploitation of learned policies from exploration of the state space, and we propose an exploration scheme that switches to new state-space trajectories rather than injecting noise continuously while learning. Avoiding continuous noise injection minimizes interference with human action and avoids bias in the convergence to the stabilizing solution of the underlying algebraic Riccati equation. We show that exploring a minimum number of pairwise distinct state-space trajectories is necessary to avoid collinearity in the learning data. In the second part, we give conditions under which existence and uniqueness of solutions can be established for off-policy reinforcement learning in continuous-time linear systems, namely prior knowledge of the input matrix.
first_indexed	2024-09-23T15:37:22Z
format	Article
id	mit-1721.1/137170
institution	Massachusetts Institute of Technology
language	English
last_indexed	2024-09-23T15:37:22Z
publishDate	2021
publisher	IEEE
record_format	dspace
spelling	mit-1721.1/137170; 2023-02-10T19:54:48Z; Shared Linear Quadratic Regulation Control: A Reinforcement Learning Approach; Abu-Khalaf, Murad; Karaman, Sertac; Rus, Daniela; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems; 2021-11-02T18:59:32Z; 2021-11-02T18:59:32Z; 2019-12; 2021-04-15T17:35:15Z; Article; http://purl.org/eprint/type/ConferencePaper; https://hdl.handle.net/1721.1/137170; Abu-Khalaf, Murad, Karaman, Sertac and Rus, Daniela. 2019. "Shared Linear Quadratic Regulation Control: A Reinforcement Learning Approach." Proceedings of the IEEE Conference on Decision and Control, 2019-December.; en; 10.1109/cdc40024.2019.9029617; Proceedings of the IEEE Conference on Decision and Control; Creative Commons Attribution-Noncommercial-Share Alike; http://creativecommons.org/licenses/by-nc-sa/4.0/; application/pdf; IEEE; arXiv
title	Shared Linear Quadratic Regulation Control: A Reinforcement Learning Approach
url	https://hdl.handle.net/1721.1/137170
