Social Interactions as Recursive MDPs

While machines and robots must interact with humans, providing them with social skills has been a largely overlooked topic. This is mostly a consequence of the fact that tasks such as navigation, command following, and even game playing are well-defined, while social reasoning still mostly remains...

Bibliographic Details
Main Authors:	Tejwani, Ravi; Kuo, Yen-Ling; Shu, Tianmin; Katz, Boris; Barbu, Andrei
Format:	Article
Published:	Center for Brains, Minds and Machines (CBMM), Conference on Robot Learning (CoRL) 2022
Online Access:	https://hdl.handle.net/1721.1/141360

_version_	1811080761276628992
author	Tejwani, Ravi; Kuo, Yen-Ling; Shu, Tianmin; Katz, Boris; Barbu, Andrei
author_facet	Tejwani, Ravi; Kuo, Yen-Ling; Shu, Tianmin; Katz, Boris; Barbu, Andrei
author_sort	Tejwani, Ravi
collection	MIT
description	While machines and robots must interact with humans, providing them with social skills has been a largely overlooked topic. This is mostly a consequence of the fact that tasks such as navigation, command following, and even game playing are well-defined, while social reasoning still mostly remains a pre-theoretic problem. We demonstrate how social interactions can be effectively incorporated into MDPs (Markov decision processes) by reasoning recursively about the goals of other agents. In essence, our method extends the reward function to include a combination of physical goals (something agents want to accomplish in the configuration space, a traditional MDP) and social goals (something agents want to accomplish relative to the goals of other agents). Our Social MDPs allow specifying reward functions in terms of the estimated reward functions of other agents, modeling interactions such as helping or hindering another agent (by maximizing or minimizing the other agent’s reward) while balancing this with the actual physical goals of each agent. Our formulation allows for an arbitrary function of another agent’s estimated reward structure and physical goals, enabling more complex behaviors such as politely hindering another agent or aggressively helping them. Extending Social MDPs in the same manner as the I-POMDP (interactive partially observable Markov decision process) extension would enable interactions such as convincing another agent that something is true.
To what extent the Social MDPs presented here and their potential Social POMDP variant account for all possible social interactions is unknown, but having a precise mathematical model to guide questions about social interactions has both practical value (we demonstrate how to make zero-shot social inferences, and one could imagine chatbots and robots guided by Social MDPs) and theoretical value, by bringing the tools of MDPs that have so successfully organized research around navigation to shed light on what social interactions really are, given their extreme importance to human well-being and human civilization.
first_indexed	2024-09-23T11:36:25Z
format	Article
id	mit-1721.1/141360
institution	Massachusetts Institute of Technology
last_indexed	2024-09-23T11:36:25Z
publishDate	2022
publisher	Center for Brains, Minds and Machines (CBMM), Conference on Robot Learning (CoRL)
record_format	dspace
spelling	mit-1721.1/141360 2022-03-25T03:24:06Z Social Interactions as Recursive MDPs Tejwani, Ravi; Kuo, Yen-Ling; Shu, Tianmin; Katz, Boris; Barbu, Andrei This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216. 2022-03-24T17:17:57Z 2022-03-24T17:17:57Z 2021-11-08 Article Technical Report Working Paper https://hdl.handle.net/1721.1/141360 CBMM Memo;130 application/pdf Center for Brains, Minds and Machines (CBMM), Conference on Robot Learning (CoRL)
spellingShingle	Tejwani, Ravi Kuo, Yen-Ling Shu, Tianmin Katz, Boris Barbu, Andrei Social Interactions as Recursive MDPs
title	Social Interactions as Recursive MDPs
url	https://hdl.handle.net/1721.1/141360

Similar Items