Value propagation networks

We present Value Propagation (VProp), a parameter-efficient differentiable planning module built on Value Iteration which can successfully be trained in a reinforcement learning fashion to solve unseen tasks, has the capability to generalize to larger map sizes, and can learn to navigate in dynamic...

Bibliographic details
Main authors:	Nardelli, N, Synnaeve, G, Lin, Z, Kohli, P, Torr, PHS, Usunier, N
Format:	Conference item
Language:	English
Published:	OpenReview 2019
