Counterfactual multi-agent policy gradients

Many real-world problems, such as network packet routing and the coordination of autonomous vehicles, are naturally modelled as cooperative multi-agent systems. There is a great need for new reinforcement learning methods that can efficiently learn decentralised policies for such systems. To this...

Bibliographic Details
Main Authors: Foerster, J; Farquhar, G; Afouras, T; Nardelli, N; Whiteson, S
Format: Conference item
Published: AAAI Press 2018