Adherence Improves Cooperation in Sequential Social Dilemmas
Social dilemmas have guided research on mutual cooperation for decades, especially the two-person social dilemma. Most famously, Tit-for-Tat performs very well in tournaments of the Prisoner’s Dilemma. Nevertheless, such strategies treat the options to cooperate or defect only as atomic actions, which cannot capture the complexity of the real world...
Main Authors: | Yuyu Yuan, Ting Guo, Pengqian Zhao, Hongpu Jiang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-08-01 |
Series: | Applied Sciences |
Subjects: | multi-agent reinforcement learning; multi-agent system; intrinsic reward; counterfactual reasoning; sequential social dilemmas |
Online Access: | https://www.mdpi.com/2076-3417/12/16/8004 |
_version_ | 1827600931091382272 |
author | Yuyu Yuan; Ting Guo; Pengqian Zhao; Hongpu Jiang
author_facet | Yuyu Yuan; Ting Guo; Pengqian Zhao; Hongpu Jiang
author_sort | Yuyu Yuan |
collection | DOAJ |
description | Social dilemmas have guided research on mutual cooperation for decades, especially the two-person social dilemma. Most famously, Tit-for-Tat performs very well in tournaments of the Prisoner’s Dilemma. Nevertheless, such strategies treat the options to cooperate or defect only as atomic actions, which cannot capture the complexity of the real world. In recent research, these options to cooperate or defect have been temporally extended. Here, we propose a novel adherence-based multi-agent reinforcement learning algorithm that achieves cooperation and coordination by rewarding agents who adhere to other agents. The evaluation of adherence is based on counterfactual reasoning: during training, each agent observes the changes in the actions of other agents when it counterfactually replaces its current action, thereby calculating the degree of adherence of the other agents to its behavior. Using adherence as an intrinsic reward enables agents to consider the collective, thus promoting cooperation. In addition, the adherence rewards of all agents are calculated in a decentralized way. We experiment in sequential social dilemma environments, and the results demonstrate the potential of the algorithm to enhance cooperation and coordination and to significantly increase the scores of deep RL agents.
first_indexed | 2024-03-09T04:45:06Z |
format | Article |
id | doaj.art-8cb4314113e549828269f3e0dcf18a81 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T04:45:06Z |
publishDate | 2022-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-8cb4314113e549828269f3e0dcf18a81 | 2023-12-03T13:16:41Z | eng | MDPI AG | Applied Sciences | 2076-3417 | 2022-08-01 | vol. 12, iss. 16, art. 8004 | doi:10.3390/app12168004 | Adherence Improves Cooperation in Sequential Social Dilemmas | Yuyu Yuan; Ting Guo; Pengqian Zhao; Hongpu Jiang (all: School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China) | https://www.mdpi.com/2076-3417/12/16/8004 | multi-agent reinforcement learning; multi-agent system; intrinsic reward; counterfactual reasoning; sequential social dilemmas
spellingShingle | Yuyu Yuan; Ting Guo; Pengqian Zhao; Hongpu Jiang; Adherence Improves Cooperation in Sequential Social Dilemmas; Applied Sciences; multi-agent reinforcement learning; multi-agent system; intrinsic reward; counterfactual reasoning; sequential social dilemmas
title | Adherence Improves Cooperation in Sequential Social Dilemmas |
title_full | Adherence Improves Cooperation in Sequential Social Dilemmas |
title_fullStr | Adherence Improves Cooperation in Sequential Social Dilemmas |
title_full_unstemmed | Adherence Improves Cooperation in Sequential Social Dilemmas |
title_short | Adherence Improves Cooperation in Sequential Social Dilemmas |
title_sort | adherence improves cooperation in sequential social dilemmas |
topic | multi-agent reinforcement learning; multi-agent system; intrinsic reward; counterfactual reasoning; sequential social dilemmas
url | https://www.mdpi.com/2076-3417/12/16/8004 |
work_keys_str_mv | AT yuyuyuan adherenceimprovescooperationinsequentialsocialdilemmas AT tingguo adherenceimprovescooperationinsequentialsocialdilemmas AT pengqianzhao adherenceimprovescooperationinsequentialsocialdilemmas AT hongpujiang adherenceimprovescooperationinsequentialsocialdilemmas |
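To make the counterfactual evaluation described in the abstract concrete, below is a minimal sketch of how one agent's adherence score for another might be computed. It is an illustration under stated assumptions, not the authors' published implementation: the model interface `predict_j`, the KL-divergence adherence measure, and the mixing weight `alpha` are hypothetical stand-ins. The abstract specifies only that each agent counterfactually replaces its own action, observes how the other agents' actions change, and uses the resulting adherence as an intrinsic reward.

```python
# Sketch of a counterfactual "adherence" intrinsic reward, assuming each agent
# holds a locally learned model of how another agent's action distribution
# depends on its own action. Interface and divergence measure are assumptions.
import torch
import torch.nn.functional as F

def adherence_of_j_to_i(predict_j, state, action_i, all_actions_i):
    """How strongly agent j's behaviour depends on agent i's chosen action.

    predict_j(state, a_i) -> probability distribution over agent j's actions,
    a hypothetical learned model held by agent i (not from the paper).
    """
    # j's predicted action distribution given what i actually did.
    p_actual = predict_j(state, action_i)

    # j's expected action distribution had i acted otherwise: average the
    # prediction over counterfactual replacements of i's action.
    counterfactuals = [a for a in all_actions_i if a != action_i]
    p_counterfactual = torch.stack(
        [predict_j(state, a) for a in counterfactuals]
    ).mean(dim=0)

    # If swapping out i's action barely changes j's distribution, adherence
    # is near zero; a large divergence means j is conditioning on i.
    return F.kl_div(p_counterfactual.log(), p_actual, reduction="sum")

def shaped_reward(extrinsic_reward, adherence_terms, alpha=0.1):
    # Adherence enters as an intrinsic reward added to the environment reward;
    # which agent in the pair receives the term, and the weight alpha, are
    # implementation choices left open by the abstract.
    return extrinsic_reward + alpha * sum(adherence_terms)

# --- toy usage -------------------------------------------------------------
if __name__ == "__main__":
    torch.manual_seed(0)

    def dummy_predict_j(state, a_i):
        # Stand-in for a learned model of agent j's policy.
        logits = torch.randn(4) + 0.5 * a_i
        return F.softmax(logits, dim=0)

    state = torch.zeros(8)  # placeholder observation
    r = adherence_of_j_to_i(dummy_predict_j, state, action_i=2,
                            all_actions_i=range(4))
    print(shaped_reward(extrinsic_reward=1.0, adherence_terms=[r]))
```

Because each agent evaluates the counterfactuals with its own local model of the other agents, the shaped reward can be computed without a central coordinator, consistent with the decentralized computation the abstract describes.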