Dynamic-depth context tree weighting

Reinforcement learning (RL) in partially observable settings is challenging because the agent’s immediate observations are not Markov. Recently proposed methods can learn variable-order Markov models of the underlying process but have steep memory requirements and are sensitive to aliasing betw...

Bibliographic details
Main authors:	Messias, J, Whiteson, S
Format:	Conference item
Published:	Curran Associates 2018
