A Policy Gradient Algorithm to Alleviate the Multi-Agent Value Overestimation Problem in Complex Environments

Multi-agent reinforcement learning excels at addressing group intelligent decision-making problems involving sequential decision-making. In particular, in complex, high-dimensional state and action spaces, it imposes higher demands on the reliability, stability, and adaptability of decision algorith...

Bibliographic Details
Main Authors:	Yang Yang, Jiang Li, Jinyong Hou, Ye Wang, Huadong Zhao
Format:	Article
Language:	English
Published:	MDPI AG 2023-11-01
Series:	Sensors
Subjects:	deep deterministic policy gradient; playback of experience; group decision-making; overestimation of value function
Online Access:	https://www.mdpi.com/1424-8220/23/23/9520

Similar Items

Research on the Deep Deterministic Policy Algorithm Based on the First-Order Inverted Pendulum
by: Hailin Hu, et al.
Published: (2023-06-01)

Why do we overestimate others' willingness to pay?
by: William J. Matthews, et al.
Published: (2016-01-01)

Overestimated prediction using polygenic prediction derived from summary statistics
by: David Keetae Park, et al.
Published: (2023-09-01)

Correcting Underestimation and Overestimation in PolInSAR Forest Canopy Height Estimation Using Microwave Penetration Depth
by: Hongbin Luo, et al.
Published: (2022-12-01)

Reducing WCET Overestimations in Multi-Thread Loops with Critical Section Usage
by: Simona Ramanauskaite, et al.
Published: (2021-03-01)

Healthy Live Births after the Transfer of Mosaic Embryos: Self-Correction or PGT-A Overestimation?
by: Gerard Campos, et al.
Published: (2023-12-01)

Overestimation of benefit when clinical trials stop early: a simulation study
by: Sharon Liu, et al.
Published: (2022-09-01)

Deep Deterministic Policy Gradient (DDPG) Agent-Based Sliding Mode Control for Quadrotor Attitudes
by: Wenjun Hu, et al.
Published: (2024-03-01)

Poor health literacy associated with stronger perceived barriers to breast cancer screening and overestimated breast cancer risk
by: Paul K. M. Poon, et al.
Published: (2023-01-01)

Agent-Based Energy Sharing Mechanism Using Deep Deterministic Policy Gradient Algorithm
by: Yi Kuang, et al.
Published: (2020-09-01)

Research on Wargame Decision-Making Method Based on Multi-Agent Deep Deterministic Policy Gradient
by: Sheng Yu, et al.
Published: (2023-04-01)

A Simple, Test-Based Method to Control the Overestimation Bias in the Analysis of Potential Prognostic Tumour Markers
by: Marzia Ognibene, et al.
Published: (2023-02-01)

First reported case of a longevity overestimation error in the new Medtronic tablet‐based device programmer
by: Pedram Kazemian, et al.
Published: (2020-10-01)

Optimal operation of regional integrated energy system based on multi-agent deep deterministic policy gradient algorithm
by: Bohan Xu, et al.
Published: (2022-11-01)

Reinforcement Learning Your Way: Agent Characterization through Policy Regularization
by: Charl Maree, et al.
Published: (2022-03-01)

“Losses disguised as wins” in electronic gambling machines contribute to win overestimation in a large online sample
by: Dan Myles, et al.
Published: (2023-12-01)

Risky and cautious shifts in group decisions: the influence of widely held values
by: Stoner, James Arthur Finch
Published: (2009)

Generative Adversarial Inverse Reinforcement Learning With Deep Deterministic Policy Gradient
by: Ming Zhan, et al.
Published: (2023-01-01)

Research on Maneuvering Decision Algorithm Based on Improved Deep Deterministic Policy Gradient
by: Jing Xianyong, et al.
Published: (2022-01-01)

An enhanced deep deterministic policy gradient algorithm for intelligent control of robotic arms
by: Ruyi Dong, et al.
Published: (2023-01-01)

Autonomous Driving of Mobile Robots in Dynamic Environments Based on Deep Deterministic Policy Gradient: Reward Shaping and Hindsight Experience Replay
by: Minjae Park, et al.
Published: (2024-01-01)

How certainty appraisal might improve both body dissatisfaction and body overestimation in anorexia nervosa: a case report
by: M. Metral, et al.
Published: (2018-10-01)

Robotic-Arm-Based Force Control by Deep Deterministic Policy Gradient in Neurosurgical Practice
by: Ibai Inziarte-Hidalgo, et al.
Published: (2023-09-01)

Knowledge Gradient: Capturing Value of Information in Iterative Decisions under Uncertainty
by: Donghun Lee
Published: (2022-11-01)

Deep Deterministic Policy Gradient-Based Autonomous Driving for Mobile Robots in Sparse Reward Environments
by: Minjae Park, et al.
Published: (2022-12-01)

Frequency jumps and subharmonic components in calls of female Odorrana tormota differentially affect the vocal behaviors of male frogs
by: Yatao Wu, et al.
Published: (2023-12-01)

Greater Horseshoe Bats Recognize the Sex and Individual Identity of Conspecifics from Their Echolocation Calls
by: Xiao Tan, et al.
Published: (2022-12-01)

Structural-functional characteristics of two song types in Phylloscopus humei (Phylloscopidae)
by: Svetlana G. Meshcheryagina, et al.
Published: (2023-02-01)

Do task and item difficulty affect overestimation of one’s hand hygiene compliance? A cross-sectional survey of physicians and nurses in surgical clinics of six hospitals in Germany
by: Jonas Lamping, et al.
Published: (2022-12-01)

DDRCN: Deep Deterministic Policy Gradient Recommendation Framework Fused with Deep Cross Networks
by: Tianhan Gao, et al.
Published: (2023-02-01)

A State-Compensated Deep Deterministic Policy Gradient Algorithm for UAV Trajectory Tracking
by: Jiying Wu, et al.
Published: (2022-06-01)

Application of a Deep Deterministic Policy Gradient Algorithm for Energy-Aimed Timetable Rescheduling Problem
by: Guang Yang, et al.
Published: (2019-09-01)

Bodily self-recognition and body size overestimation in restrictive anorexia nervosa: implicit and explicit mechanisms
by: Marianna Ambrosecchia, et al.
Published: (2023-07-01)

Autonomous Driving Control Based on the Technique of Semantic Segmentation
by: Jichiang Tsai, et al.
Published: (2023-01-01)

Autonomous Driving Control Using the DDPG and RDPG Algorithms
by: Che-Cheng Chang, et al.
Published: (2021-11-01)

Alarm Calling in Plateau Pika (Ochotona curzoniae): Evidence from Field Observations and Simulated Predator and Playback Experiments
by: Meina Ma, et al.
Published: (2023-04-01)

Female harbor seal (Phoca vitulina) behavioral response to playbacks of underwater male acoustic advertisement displays
by: Leanna P. Matthews, et al.
Published: (2018-03-01)

Appraisal of unimodal cues during agonistic interactions in Maylandia zebra
by: Laura Chabrolles, et al.
Published: (2017-08-01)

Policy-Gradient and Actor-Critic Based State Representation Learning for Safe Driving of Autonomous Vehicles
by: Abhishek Gupta, et al.
Published: (2020-10-01)

Autonomous Shape Decision Making of Morphing Aircraft with Improved Reinforcement Learning
by: Weilai Jiang, et al.
Published: (2024-01-01)