VariBAD: variational bayes-adaptive deep RL via meta-learning

Trading off exploration and exploitation in an unknown environment is key to maximising expected online return during learning. A Bayes-optimal policy, which does so optimally, conditions its actions not only on the environment state but also on the agent's uncertainty about the environment. Co...

Bibliographic details
Main author:	Whiteson, S
Format:	Journal article
Language:	English
Published:	Journal of Machine Learning Research 2021
