Reducing Q-Value Estimation Bias via Mutual Estimation and Softmax Operation in MADRL

With the development of electronic game technology, the content of electronic games presents a larger number of units, richer unit attributes, more complex game mechanisms, and more diverse team strategies. Multi-agent deep reinforcement learning shines brightly in this type of team electronic game,...

Bibliographic Details
Main Authors:	Zheng Li, Xinkai Chen, Jiaqing Fu, Ning Xie, Tingting Zhao
Format:	Article
Language:	English
Published:	MDPI AG 2024-01-01
Series:	Algorithms
Subjects:	reinforcement learning; game AI; multi-agent Q-network mutual estimation; softmax Bellman operation; reinforcement learning environment
Online Access:	https://www.mdpi.com/1999-4893/17/1/36

Similar Items

Hardware Implementation of a Softmax-Like Function for Deep Learning
by: Ioannis Kouretas, et al.
Published: (2020-08-01)

Techniques and Paradigms in Modern Game AI Systems
by: Yunlong Lu, et al.
Published: (2022-08-01)

Hybrid-Margin Softmax for the Detection of Trademark Image Similarity
by: Chenyang Wang, et al.
Published: (2024-03-01)

A Low-Voltage, Low-Power Reconfigurable Current-Mode Softmax Circuit for Analog Neural Networks
by: Massimo Vatalaro, et al.
Published: (2021-04-01)

Angular Margin-Mining Softmax Loss for Face Recognition
by: Jwajin Lee, et al.
Published: (2022-01-01)

Leveraging Uncertainties in Softmax Decision-Making Models for Low-Power IoT Devices
by: Chiwoo Cho, et al.
Published: (2020-08-01)

Approaches That Use Domain-Specific Expertise: Behavioral-Cloning-Based Advantage Actor-Critic in Basketball Games
by: Taehyeok Choi, et al.
Published: (2023-02-01)

Using VizDoom Research Platform Scenarios for Benchmarking Reinforcement Learning Algorithms in First-Person Shooter Games
by: Adil Khan, et al.
Published: (2024-01-01)

Energy-Efficient Multi-UAVs Cooperative Trajectory Optimization for Communication Coverage: An MADRL Approach
by: Tianyong Ao, et al.
Published: (2023-01-01)

Official International Mahjong: A New Playground for AI Research
by: Yunlong Lu, et al.
Published: (2023-04-01)

A Novel Actor—Critic Motor Reinforcement Learning for Continuum Soft Robots
by: Luis Pantoja-Garcia, et al.
Published: (2023-10-01)

An Empirical Evaluation of Enhanced Performance Softmax Function in Deep Learning
by: Sumiran Mehra, et al.
Published: (2023-01-01)

Vehicle Re-Identification in Aerial Imagery Based on Normalized Virtual Softmax Loss
by: Wenzuo Qiao, et al.
Published: (2022-05-01)

Integrating Enhanced Sparse Autoencoder-Based Artificial Neural Network Technique and Softmax Regression for Medical Diagnosis
by: Sarah A. Ebiaredoh-Mienye, et al.
Published: (2020-11-01)

Deep Classification with Linearity-Enhanced Logits to Softmax Function
by: Hao Shao, et al.
Published: (2023-04-01)

A Partially Interpretable Adaptive Softmax Regression for Credit Scoring
by: Lkhagvadorj Munkhdalai, et al.
Published: (2021-04-01)

Reinforcement Procedure for Randomized Machine Learning
by: Yuri S. Popkov, et al.
Published: (2023-08-01)

Explaining the Behaviour of Reinforcement Learning Agents in a Multi-Agent Cooperative Environment Using Policy Graphs
by: Marc Domenech i Vila, et al.
Published: (2024-01-01)

Q-Learning Algorithms: A Comprehensive Classification and Applications
by: Beakcheol Jang, et al.
Published: (2019-01-01)

An Efficient Centralized Multi-Agent Reinforcement Learner for Cooperative Tasks
by: Dengyu Liao, et al.
Published: (2023-01-01)

DADE-DQN: Dual Action and Dual Environment Deep Q-Network for Enhancing Stock Trading Strategy
by: Yuling Huang, et al.
Published: (2023-08-01)

Joint Shaping of Geometry and Probability based on Mutual Information Neural Estimation
by: Jia-xi LIANG, et al.
Published: (2022-06-01)

Mobilized ad-hoc networks: A reinforcement learning approach
by: Chang, Yu-Han, et al.
Published: (2005)

Mobilized ad-hoc networks: A reinforcement learning approach
by: Chang, Yu-Han, et al.
Published: (2004)

Powered Landing Control of Reusable Rockets Based on Softmax Double DDPG
by: Wenting Li, et al.
Published: (2023-06-01)

Optimal feedback control of dynamical systems via value-function approximation
by: Kunisch, Karl, et al.
Published: (2023-07-01)

Attentional Factorized Q-Learning for Many-Agent Learning
by: Xiaoqiang Wang, et al.
Published: (2022-01-01)

Greedy Action Selection and Pessimistic Q-Value Updating in Multi-Agent Reinforcement Learning with Sparse Interaction
by: Toshihiro Kujirai, et al.
Published: (2019-05-01)

Pneumonia Detection on X-Ray Imaging using Softmax Output in Multilevel Meta Ensemble Algorithm of Deep Convolutional Neural Network Transfer Learning Models
by: Simeon Yuda Prasetyo, et al.
Published: (2023-07-01)

A Sparse Autoencoder and Softmax Regression Based Diagnosis Method for the Attachment on the Blades of Marine Current Turbine
by: Yilai Zheng, et al.
Published: (2019-02-01)

Optimal Deception Asset Deployment in Cybersecurity: A Nash Q-Learning Approach in Multi-Agent Stochastic Games
by: Guanhua Kong, et al.
Published: (2023-12-01)

Applications of Multi-Agent Deep Reinforcement Learning: Models and Algorithms
by: Abdikarim Mohamed Ibrahim, et al.
Published: (2021-11-01)

A robust estimator of mutual information for deep learning interpretability
by: Davide Piras, et al.
Published: (2023-01-01)

Target-Network Update Linked with Learning Rate Decay Based on Mutual Information and Reward in Deep Reinforcement Learning
by: Chayoung Kim
Published: (2023-09-01)

Q-learning-based migration leading to spontaneous emergence of segregation
by: Zhixue He, et al.
Published: (2022-01-01)

Anticheat System Based on Reinforcement Learning Agents in Unity
by: Mihael Lukas, et al.
Published: (2022-03-01)

An Optimization Method for Collaborative Radar Antijamming Based on Multi-Agent Reinforcement Learning
by: Cheng Feng, et al.
Published: (2023-06-01)

Importance Sampling for Reinforcement Learning with Multiple Objectives
by: Shelton, Christian Robert
Published: (2004)

Objective Weight Interval Estimation Using Adversarial Inverse Reinforcement Learning
by: Naoya Takayama, et al.
Published: (2023-01-01)

An SDN Controller-Based Network Slicing Scheme Using Constrained Reinforcement Learning
by: Mduduzi C. Hlophe, et al.
Published: (2022-01-01)