The Missing Link Between Memory and Reinforcement Learning

Reinforcement learning systems usually assume that a value function is defined over all states (or state-action pairs) that can immediately give the value of a particular state or action. These values are used by a selection mechanism to decide which action to take. In contrast, when humans and anim...

Bibliographic Details
Main Authors: Christian Balkenius, Trond A. Tjøstheim, Birger Johansson, Annika Wallin, Peter Gärdenfors
Format: Article
Language: English
Published: Frontiers Media S.A. 2020-12-01
Series: Frontiers in Psychology
Subjects: memory model, decision making, accumulator model, episodic memory, semantic memory
Online Access: https://www.frontiersin.org/articles/10.3389/fpsyg.2020.560080/full
author Christian Balkenius
Trond A. Tjøstheim
Birger Johansson
Annika Wallin
Peter Gärdenfors
author_sort Christian Balkenius
collection DOAJ
description Reinforcement learning systems usually assume that a value function is defined over all states (or state-action pairs) that can immediately give the value of a particular state or action. These values are used by a selection mechanism to decide which action to take. In contrast, when humans and animals make decisions, they collect evidence for different alternatives over time and take action only when sufficient evidence has been accumulated. We have previously developed a model of memory processing that includes semantic, episodic and working memory in a comprehensive architecture. Here, we describe how this memory mechanism can support decision making when the alternatives cannot be evaluated based on immediate sensory information alone. Instead we first imagine, and then evaluate a possible future that will result from choosing one of the alternatives. Here we present an extended model that can be used as a model for decision making that depends on accumulating evidence over time, whether that information comes from the sequential attention to different sensory properties or from internal simulation of the consequences of making a particular choice. We show how the new model explains both simple immediate choices, choices that depend on multiple sensory factors and complicated selections between alternatives that require forward looking simulations based on episodic and semantic memory structures. In this framework, vicarious trial and error is explained as an internal simulation that accumulates evidence for a particular choice. We argue that a system like this forms the “missing link” between more traditional ideas of semantic and episodic memory, and the associative nature of reinforcement learning.
first_indexed 2024-12-17T23:04:52Z
format Article
id doaj.art-cc3c2b52287c4768b268dbf392e01b90
institution Directory Open Access Journal
issn 1664-1078
language English
last_indexed 2024-12-17T23:04:52Z
publishDate 2020-12-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Psychology
spelling doaj.art-cc3c2b52287c4768b268dbf392e01b90
2022-12-21T21:29:18Z
eng
Frontiers Media S.A.
Frontiers in Psychology
1664-1078
2020-12-01
Volume 11
DOI 10.3389/fpsyg.2020.560080
Article 560080
The Missing Link Between Memory and Reinforcement Learning
Christian Balkenius (Lund University Cognitive Science, Lund University, Lund, Sweden)
Trond A. Tjøstheim (Lund University Cognitive Science, Lund University, Lund, Sweden)
Birger Johansson (Lund University Cognitive Science, Lund University, Lund, Sweden)
Annika Wallin (Lund University Cognitive Science, Lund University, Lund, Sweden)
Peter Gärdenfors (Lund University Cognitive Science, Lund University, Lund, Sweden; Palaeo-Research Institute, University of Johannesburg, Johannesburg, South Africa)
title The Missing Link Between Memory and Reinforcement Learning
topic memory model
decision making
accumulator model
episodic memory
semantic memory
url https://www.frontiersin.org/articles/10.3389/fpsyg.2020.560080/full