Combining Policy Search with Planning in Multi-agent Cooperation

It is cooperation that essentially differentiates multi-agent systems (MASs) from single-agent intelligence. In realistic MAS applications such as RoboCup, repeated work has shown that traditional machine learning (ML) approaches have difficulty mapping directly from cooperative behaviours to actuator outputs. To overcome this problem, vertical layered architectures are commonly used to break cooperation down into behavioural layers; ML is then used to generate the different low-level skills, and a planning mechanism is added on top to create high-level cooperation. We propose a novel method called Policy Search Planning (PSP), in which policy search is used to find an optimal policy for selecting plans from a plan pool. PSP extends an existing gradient-search method (GPOMDP) to the MAS domain. We demonstrate how PSP can be applied in RoboCup Simulation, and our experimental results show that it is robust and adaptive, and that it outperforms other methods. © 2009 Springer Berlin Heidelberg.
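
The abstract sketches the core mechanism: a parameterised policy selects plans from a plan pool, and its parameters are tuned by a GPOMDP-style gradient search. As a rough illustration of that idea (a minimal sketch, not the authors' implementation; the softmax parameterisation, feature vectors, and reward signal below are all assumptions), a policy over plans can be updated with the standard GPOMDP eligibility-trace estimator:

    import numpy as np

    def softmax_policy(theta, features):
        # Probability of selecting each plan in the pool for one observation.
        # theta: (n_plans, n_features) weights; features: (n_features,) vector.
        scores = theta @ features
        scores -= scores.max()              # subtract max for numerical stability
        p = np.exp(scores)
        return p / p.sum()

    def gpomdp_update(theta, episode, beta=0.95, lr=0.01):
        # One GPOMDP pass over an episode of (features, chosen_plan, reward)
        # triples; beta discounts the eligibility trace (bias/variance knob).
        z = np.zeros_like(theta)            # eligibility trace
        grad = np.zeros_like(theta)         # running gradient estimate
        for t, (x, a, r) in enumerate(episode, start=1):
            p = softmax_policy(theta, x)
            dlog = -np.outer(p, x)          # gradient of log pi(a|x)
            dlog[a] += x                    # for a softmax policy
            z = beta * z + dlog
            grad += (r * z - grad) / t      # running average of r_t * z_t
        return theta + lr * grad            # step up the estimated gradient

    # Hypothetical usage: 5 candidate plans scored on 8 observation features.
    rng = np.random.default_rng(0)
    theta = np.zeros((5, 8))
    episode = [(rng.normal(size=8), int(rng.integers(5)), float(rng.normal()))
               for _ in range(100)]
    theta = gpomdp_update(theta, episode)

Because GPOMDP needs only each agent's own observations and actions, an estimator like this can in principle run independently on every agent, which is presumably what makes it attractive for the multi-agent extension the abstract describes.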

Bibliographic Details
Main Authors: Ma, J; Cameron, S
Format: Conference item
Published: 2009
Institution: University of Oxford
Identifier: uuid:217ce326-dce5-4423-adbb-946bd20e04dc