Improving single and multi-agent deep reinforcement learning methods

Reinforcement Learning (RL) is a framework in which an agent learns to make decisions using data-driven feedback from interactions with its environment, in the form of rewards or penalties for its actions. Deep RL integrates deep learning with RL, harnessing the power of deep neural networks to process complex, high-dimensional data. Using the framework of deep RL, the machine learning research community has made tremendous progress in enabling machines to make sequential decisions over long time horizons. These advances include attaining super-human performance on Atari games [Mnih et al., 2015], mastering the game of Go and beating the human world champion [Silver et al., 2017], and providing robust recommendation systems [Gomez-Uribe and Hunt, 2015; Singh et al., 2021]. This thesis identifies key challenges that impede the learning of RL agents in their environments and improves the corresponding methods, leading to better agent performance, improved sample efficiency, and more generalizable learned policies.

In Part I of the thesis, we focus on exploration in single-agent RL settings, where an agent must interact with a complex environment to pursue a goal. An agent that fails to explore its environment is unlikely to achieve high performance: it will miss critical rewards and, as a result, cannot learn optimal behavior. A key challenge arises in sparse-reward environments, where the agent receives feedback only once the task is completed, making exploration especially difficult. We propose a novel method that enables semantic exploration, resulting in higher sample efficiency and better performance on sparse-reward tasks.

In Part II of the thesis, we focus on cooperative Multi-Agent Reinforcement Learning (MARL), an extension of the single-agent RL setting in which multiple agents interact in the same environment to accomplish a shared task. In multi-agent tasks that require significant coordination and impose strict penalties for miscoordination, state-of-the-art MARL methods often fail to learn useful behaviors because agents get stuck in a sub-optimal equilibrium. Another challenge is exploration in the joint action space of all agents, which grows exponentially with the number of agents. To address these challenges, we propose approaches such as universal value exploration and scalable role-based learning. These methods improve coordination among agents, speed up exploration, and enhance the agents' ability to adapt to new environments and tasks, demonstrating zero-shot generalization and higher sample efficiency. Lastly, we investigate independent policy-based methods in cooperative MARL, where each agent treats the other agents as part of the environment. We show that such methods can outperform state-of-the-art joint learning approaches on a popular multi-agent benchmark.

In summary, the contributions of this thesis significantly improve the state of the art in deep (multi-agent) reinforcement learning. The agents developed in this thesis can explore their environments efficiently to improve sample efficiency, learn tasks that require significant multi-agent coordination, and achieve zero-shot generalization across various tasks.
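As a rough illustration of the agent-environment feedback loop and the sparse-reward difficulty described in the abstract, the following minimal sketch runs tabular Q-learning with epsilon-greedy exploration on a toy chain environment whose only non-zero reward arrives at the goal state. This is a generic textbook setup, not a method from the thesis; the environment, names, and hyperparameters are all illustrative.

# Minimal sketch (illustrative only, not from the thesis): an agent acts, the
# environment returns a reward, and a Q-learning update turns that feedback
# into a policy. The reward is sparse: only reaching the last state pays off.
import random
from collections import defaultdict

N_STATES = 10            # chain states 0..9; state 9 is the goal
ACTIONS = (-1, +1)       # step left or right along the chain

def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # feedback only on task completion
    return next_state, reward, done

Q = defaultdict(float)              # Q[(state, action)] -> estimated return
alpha, gamma, eps = 0.1, 0.99, 0.2

for episode in range(500):
    state, done = 0, False
    for _ in range(100):            # cap episode length
        # epsilon-greedy exploration: mostly exploit, occasionally act randomly
        if random.random() < eps:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward, done = step(state, action)
        # one-step Q-learning update from the reward feedback
        target = reward + (0.0 if done else gamma * max(Q[(next_state, a)] for a in ACTIONS))
        Q[(state, action)] += alpha * (target - Q[(state, action)])
        state = next_state
        if done:
            break

print("greedy action in state 0:", max(ACTIONS, key=lambda a: Q[(0, a)]))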

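To make the joint-action-space argument from Part II concrete: with n agents each choosing from k actions, the joint action space has k to the power n elements, so any method that explores joint actions naively scales exponentially in the number of agents. The snippet below is a toy illustration of that count, not code from the thesis.

# Illustrative only: exponential growth of the joint action space in MARL.
from itertools import product

k = 5                        # actions available to each agent
for n in (2, 4, 8):          # number of agents
    print(f"{n} agents, {k} actions each -> {k ** n} joint actions")

# Explicit enumeration is feasible only for tiny n and k.
joint_actions = list(product(range(k), repeat=3))
print(len(joint_actions), "joint actions for 3 agents")   # 125 = 5**3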

Bibliographic Details
Main Author: Gupta, T
Other Authors: Whiteson, S
Format: Thesis
Language: English
Published: 2023
Institution: University of Oxford
Subjects: Machine learning