Value propagation networks

We present Value Propagation (VProp), a parameter-efficient differentiable planning module built on Value Iteration which can successfully be trained in a reinforcement learning fashion to solve unseen tasks, has the capability to generalize to larger map sizes, and can learn to navigate in dynamic...


Bibliographic details
Main authors: Nardelli, N; Synnaeve, G; Lin, Z; Kohli, P; Torr, PHS; Usunier, N
Format: Conference item
Language: English
Published: OpenReview 2019