A Policy Gradient Algorithm to Alleviate the Multi-Agent Value Overestimation Problem in Complex Environments
Multi-agent reinforcement learning excels at group decision-making problems that involve sequential decisions. Complex, high-dimensional state and action spaces in particular impose higher demands on the reliability, stability, and adaptability of decision algorith...
Main Authors: | Yang Yang, Jiang Li, Jinyong Hou, Ye Wang, Huadong Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-11-01 |
Series: | Sensors |
Subjects: | |
Online Access: | https://www.mdpi.com/1424-8220/23/23/9520 |
Similar Items
- Research on the Deep Deterministic Policy Algorithm Based on the First-Order Inverted Pendulum
  by: Hailin Hu, et al.
  Published: (2023-06-01)
- Why do we overestimate others' willingness to pay?
  by: William J. Matthews, et al.
  Published: (2016-01-01)
- Overestimated prediction using polygenic prediction derived from summary statistics
  by: David Keetae Park, et al.
  Published: (2023-09-01)
- Correcting Underestimation and Overestimation in PolInSAR Forest Canopy Height Estimation Using Microwave Penetration Depth
  by: Hongbin Luo, et al.
  Published: (2022-12-01)
- Reducing WCET Overestimations in Multi-Thread Loops with Critical Section Usage
  by: Simona Ramanauskaite, et al.
  Published: (2021-03-01)