Offline Reward Learning from Human Demonstrations and Feedback: A Linear Programming Approach

In many complex sequential decision-making tasks, there is often no known explicit reward function, and the only information available is human demonstrations and feedback data. To infer and shape the underlying reward function from this data, two key methodologies have emerged: inverse reinforcemen...

Bibliographic Details
Main Author: Kim, Kihyun
Other Authors: Ozdaglar, Asuman
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access: https://hdl.handle.net/1721.1/156337