Balancing Teacher Following and Reward Maximization in Reinforcement Learning

Learning from rewards (i.e., reinforcement learning or RL) and learning to imitate a teacher (i.e., teacher-student learning) are two established approaches for solving sequential decision-making problems. To combine the benefits of these different forms of learning, it is common to train a policy t...

Full description

Bibliographic Details
Main Author: Shenfeld Amit, Idan
Other Authors: Agrawal, Pulkit
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access:https://hdl.handle.net/1721.1/156290