Exploratory State Representation Learning
Not having access to compact and meaningful representations is known to significantly increase the complexity of reinforcement learning (RL). For this reason, it can be useful to perform state representation learning (SRL) before tackling RL tasks. However, obtaining a good state representation can only be done if a large diversity of transitions is observed, which can require a difficult exploration, especially if the environment is initially reward-free...
Main Authors: | Astrid Merckling, Nicolas Perrin-Gilbert, Alex Coninx, Stéphane Doncieux |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2022-02-01 |
Series: | Frontiers in Robotics and AI |
Subjects: | state representation learning; pretraining; exploration; unsupervised learning; deep reinforcement learning |
Online Access: | https://www.frontiersin.org/articles/10.3389/frobt.2022.762051/full |
author | Astrid Merckling, Nicolas Perrin-Gilbert, Alex Coninx, Stéphane Doncieux |
collection | DOAJ |
description | Not having access to compact and meaningful representations is known to significantly increase the complexity of reinforcement learning (RL). For this reason, it can be useful to perform state representation learning (SRL) before tackling RL tasks. However, obtaining a good state representation can only be done if a large diversity of transitions is observed, which can require a difficult exploration, especially if the environment is initially reward-free. To solve the problems of exploration and SRL in parallel, we propose a new approach called XSRL (eXploratory State Representation Learning). On one hand, it jointly learns compact state representations and a state transition estimator which is used to remove unexploitable information from the representations. On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a k-step learning progress bonus to form the maximization objective of a discovery policy. This results in a policy that seeks complex transitions from which the trained models can effectively learn. Our experimental results show that the approach leads to efficient exploration in challenging environments with image observations, and to state representations that significantly accelerate learning in RL tasks. |
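The description above outlines the XSRL objective only in prose. As a rough illustration, the sketch below shows one way the described intrinsic reward (the inverse model's prediction error plus a k-step learning progress bonus) could be computed; all names, shapes, and the exact form of the bonus are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (PyTorch), NOT the authors' code: every name, shape,
# and the exact form of the learning-progress bonus is an assumption.
import torch
import torch.nn as nn

class InverseModel(nn.Module):
    """Predicts the action a_t that led from state s_t to state s_{t+1}."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, s_t: torch.Tensor, s_next: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([s_t, s_next], dim=-1))

def discovery_reward(model: InverseModel,
                     s_t: torch.Tensor, s_next: torch.Tensor, a_t: torch.Tensor,
                     error_history: list, k: int = 10) -> torch.Tensor:
    """Inverse-model prediction error plus a k-step learning progress bonus.

    error_history holds the mean prediction error after each recent training
    update; here the bonus is taken as the error decrease over the last k
    updates, so it is large where the model is still actively improving.
    """
    pred_error = ((model(s_t, s_next) - a_t) ** 2).mean(dim=-1)
    progress = error_history[-k] - error_history[-1] if len(error_history) >= k else 0.0
    return pred_error + progress  # the discovery policy maximizes this signal
```

Maximizing this combined signal would steer the discovery policy toward transitions that are still hard for the inverse model but on which it keeps improving, in the spirit of the description's "complex transitions from which the trained models can effectively learn."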
format | Article |
id | doaj.art-aacfc343d3f9470b9e5d410a29c9c31e |
institution | Directory Open Access Journal |
issn | 2296-9144 |
language | English |
publishDate | 2022-02-01 |
publisher | Frontiers Media S.A. |
series | Frontiers in Robotics and AI |
doi | 10.3389/frobt.2022.762051 |
title | Exploratory State Representation Learning |
topic | state representation learning; pretraining; exploration; unsupervised learning; deep reinforcement learning |
url | https://www.frontiersin.org/articles/10.3389/frobt.2022.762051/full |