Efficiently Detecting Non-Stationary Opponents: A Bayesian Policy Reuse Approach under Partial Observability

In multi-agent domains, dealing with non-stationary opponents that change behaviors (policies) consistently over time is still a challenging problem, where an agent usually requires the ability to detect the opponent’s policy accurately and adopt the optimal response policy accordingly. Previous wor...

Full description

Bibliographic Details
Main Authors: Yu Wang, Ke Fu, Hao Chen, Quan Liu, Jian Huang, Zhongjie Zhang
Format: Article
Language:English
Published: MDPI AG 2022-07-01
Series:Applied Sciences
Subjects:
Online Access:https://www.mdpi.com/2076-3417/12/14/6953