Counterfactual multi-agent policy gradients
Many real-world problems, such as network packet routing and the coordination of autonomous vehicles, are naturally modelled as cooperative multi-agent systems. There is a great need for new reinforcement learning methods that can efficiently learn decentralised policies for such systems. To this...
Main Authors: | , , , , |
---|---|
Format: | Conference item |
Published: | AAAI Press, 2018 |