A Survey on Population-Based Deep Reinforcement Learning
Many real-world applications can be described as large-scale games of imperfect information, which require extensive prior domain knowledge, especially in competitive or human–AI cooperation settings. Population-based training methods have become a popular solution to learn robust policies without a...
Main Authors: | Weifan Long, Taixian Hou, Xiaoyi Wei, Shichao Yan, Peng Zhai, Lihua Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-05-01 |
Series: | Mathematics |
Subjects: | reinforcement learning, multi-agent reinforcement learning, self play, population play |
Online Access: | https://www.mdpi.com/2227-7390/11/10/2234 |
_version_ | 1797599187287670784 |
---|---|
author | Weifan Long, Taixian Hou, Xiaoyi Wei, Shichao Yan, Peng Zhai, Lihua Zhang |
author_sort | Weifan Long |
collection | DOAJ |
description | Many real-world applications can be described as large-scale games of imperfect information, which require extensive prior domain knowledge, especially in competitive or human–AI cooperation settings. Population-based training methods have become a popular solution to learn robust policies without any prior knowledge, which can generalize to policies of other players or humans. In this survey, we shed light on population-based deep reinforcement learning (PB-DRL) algorithms, their applications, and general frameworks. We introduce several independent subject areas, including naive self-play, fictitious self-play, population-play, evolution-based training methods, and the policy-space response oracle family. These methods provide a variety of approaches to solving multi-agent problems and are useful in designing robust multi-agent reinforcement learning algorithms that can handle complex real-life situations. Finally, we discuss challenges and hot topics in PB-DRL algorithms. We hope that this brief survey can provide guidance and insights for researchers interested in PB-DRL algorithms. |
first_indexed | 2024-03-11T03:32:12Z |
format | Article |
id | doaj.art-4ab3ae5333f0433a8a844b08a024d413 |
institution | Directory of Open Access Journals |
issn | 2227-7390 |
language | English |
last_indexed | 2024-03-11T03:32:12Z |
publishDate | 2023-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Mathematics |
spelling | doaj.art-4ab3ae5333f0433a8a844b08a024d413; 2023-11-18T02:17:59Z; eng; MDPI AG; Mathematics; ISSN 2227-7390; 2023-05-01; vol. 11, no. 10, article 2234; DOI 10.3390/math11102234; A Survey on Population-Based Deep Reinforcement Learning; Weifan Long, Taixian Hou, Xiaoyi Wei, Shichao Yan, Peng Zhai, Lihua Zhang (all: Academy for Engineering and Technology, Fudan University, Shanghai 200433, China); https://www.mdpi.com/2227-7390/11/10/2234; reinforcement learning, multi-agent reinforcement learning, self play, population play |
title | A Survey on Population-Based Deep Reinforcement Learning |
topic | reinforcement learning, multi-agent reinforcement learning, self play, population play |
url | https://www.mdpi.com/2227-7390/11/10/2234 |