Off-Policy Meta-Reinforcement Learning With Belief-Based Task Inference
Meta-reinforcement learning (meta-RL) addresses the sample inefficiency of deep RL by reusing experience from past tasks to solve a new task. However, most existing meta-RL methods require partially or fully on-policy data, which limits how much sample efficiency can improve. To allevia...
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2022-01-01 |
Collection: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/9763505/ |