Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition

Bibliographic Details
Main Authors: Jin, Chi; Jin, Tiancheng; Luo, Haipeng; Sra, Suvrit; Yu, Tiancheng
Other Authors: Massachusetts Institute of Technology. Institute for Data, Systems, and Society
Format: Article
Language: English
Published: 2022
Online Access: https://hdl.handle.net/1721.1/143895