A Policy Gradient Algorithm to Alleviate the Multi-Agent Value Overestimation Problem in Complex Environments
Multi-agent reinforcement learning excels at addressing group intelligent decision-making problems involving sequential decision-making. In particular, in complex, high-dimensional state and action spaces, it imposes higher demands on the reliability, stability, and adaptability of decision algorith...
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
MDPI AG
2023-11-01
|
Series: | Sensors |
Subjects: | |
Online Access: | https://www.mdpi.com/1424-8220/23/23/9520 |