Intelligent Cooperative Control Architecture: A Framework for Performance Improvement Using Safe Learning
Planning for multi-agent systems such as task assignment for teams of limited-fuel unmanned aerial vehicles (UAVs) is challenging due to uncertainties in the assumed models and the very large size of the planning space. Researchers have developed fast cooperative planners based on simple models (e.g...
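Main Authors: | Geramifard, Alborz; Redding, Joshua; How, Jonathan P. |
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics |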
Format: | Article |
Language: | en_US |
Published: | Springer-Verlag 2013 |
Online Access: | http://hdl.handle.net/1721.1/81483 https://orcid.org/0000-0002-2508-1957 https://orcid.org/0000-0001-8576-1930 |
_version_ | 1826209242210631680 |
---|---|
author | Geramifard, Alborz Redding, Joshua How, Jonathan P. |
author2 | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics |
collection | MIT |
description | Planning for multi-agent systems such as task assignment for teams of limited-fuel unmanned aerial vehicles (UAVs) is challenging due to uncertainties in the assumed models and the very large size of the planning space. Researchers have developed fast cooperative planners based on simple models (e.g., linear and deterministic dynamics), yet inaccuracies in assumed models will impact the resulting performance. Learning techniques are capable of adapting the model and providing better policies asymptotically compared to cooperative planners, yet they often violate the safety conditions of the system due to their exploratory nature. Moreover, they frequently require an impractically large number of interactions to perform well. This paper introduces the intelligent Cooperative Control Architecture (iCCA) as a framework for combining cooperative planners and reinforcement learning techniques. iCCA improves the policy of the cooperative planner while reducing the risk and sample complexity of the learner. Empirical results in gridworld and task assignment for fuel-limited UAV domains with problem sizes up to 9 billion state-action pairs verify the advantage of iCCA over pure learning and planning strategies. |
first_indexed | 2024-09-23T14:19:44Z |
format | Article |
id | mit-1721.1/81483 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T14:19:44Z |
publishDate | 2013 |
publisher | Springer-Verlag |
record_format | dspace |
spelling | mit-1721.1/81483 2022-10-01T20:36:51Z Intelligent Cooperative Control Architecture: A Framework for Performance Improvement Using Safe Learning Geramifard, Alborz Redding, Joshua How, Jonathan P. Massachusetts Institute of Technology. Department of Aeronautics and Astronautics Massachusetts Institute of Technology. Laboratory for Information and Decision Systems Geramifard, Alborz How, Jonathan P. 2013-10-23T16:06:34Z 2013-10-23T16:06:34Z 2013-03 2012-08 Article http://purl.org/eprint/type/JournalArticle 0921-0296 1573-0409 http://hdl.handle.net/1721.1/81483 Geramifard, Alborz, Joshua Redding, and Jonathan P. How. “Intelligent Cooperative Control Architecture: A Framework for Performance Improvement Using Safe Learning.” Journal of Intelligent & Robotic Systems 72, no. 1 (October 13, 2013): 83-103. https://orcid.org/0000-0002-2508-1957 https://orcid.org/0000-0001-8576-1930 en_US http://dx.doi.org/10.1007/s10846-013-9826-6 Journal of Intelligent & Robotic Systems Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/ application/pdf Springer-Verlag MIT web domain |
title | Intelligent Cooperative Control Architecture: A Framework for Performance Improvement Using Safe Learning |
url | http://hdl.handle.net/1721.1/81483 https://orcid.org/0000-0002-2508-1957 https://orcid.org/0000-0001-8576-1930 |