Reinforcement Learning Model With Dynamic State Space Tested on Target Search Tasks for Monkeys: Extension to Learning Task Events

Bibliographic Details
Main Authors: Kazuhiro Sakamoto, Hinata Yamada, Norihiko Kawaguchi, Yoshito Furusawa, Naohiro Saito, Hajime Mushiake
Format: Article
Language: English
Published: Frontiers Media S.A. 2022-06-01
Series: Frontiers in Computational Neuroscience
Subjects:
Online Access:https://www.frontiersin.org/articles/10.3389/fncom.2022.784604/full
_version_ 1818554283623186432
author Kazuhiro Sakamoto
Kazuhiro Sakamoto
Hinata Yamada
Norihiko Kawaguchi
Yoshito Furusawa
Naohiro Saito
Hajime Mushiake
author_facet Kazuhiro Sakamoto
Kazuhiro Sakamoto
Hinata Yamada
Norihiko Kawaguchi
Yoshito Furusawa
Naohiro Saito
Hajime Mushiake
author_sort Kazuhiro Sakamoto
collection DOAJ
description Learning is a crucial basis for biological systems to adapt to their environments. Environments comprise various states or episodes, and episode-dependent learning is essential for adapting to such complex situations. Here, we developed a model for learning a two-target search task used in primate physiological experiments. In the task, the agent is required to gaze at one of four presented light spots. Two neighboring spots serve alternately as the correct target, and the correct target pair is switched after a certain number of consecutive successes. To obtain rewards with high probability, the agent must make decisions based on the actions and results of the previous two trials. Our previous work achieved this by using a dynamic state space. However, to learn a task that includes events such as fixation on the initial central spot, the model framework had to be extended. For this purpose, we propose a "history-in-episode architecture." Specifically, we divide states into episodes and histories, and actions are selected based on the histories within each episode. When we compared the proposed model, including the dynamic state space, with the conventional SARSA method on the two-target search task, the former performed close to the theoretical optimum, whereas the latter never achieved a target-pair switch because it had to re-learn each correct target every time. The reinforcement learning model comprising the proposed history-in-episode architecture and the dynamic state space enables episode-dependent learning and provides a basis for learning systems that adapt flexibly to complex environments.
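The task and learning scheme described in the abstract can be sketched as a toy implementation. This is a hypothetical illustration, not the authors' code: the four spots, the alternation within a neighboring pair, and the pair switch after consecutive successes follow the abstract, but all class names, parameter values (e.g. a switch after five successes), and the fixed two-trial-history state encoding are assumptions made here for demonstration.

```python
import random

class TwoTargetSearchTask:
    """Toy two-target search task: two neighboring spots alternate as the
    correct target; the pair switches after `switch_after` consecutive
    successes (parameters are illustrative assumptions)."""
    PAIRS = [(0, 1), (1, 2), (2, 3), (3, 0)]  # neighboring spot pairs

    def __init__(self, switch_after=5, seed=0):
        self.rng = random.Random(seed)
        self.switch_after = switch_after
        self.pair = self.rng.choice(self.PAIRS)
        self.turn = 0      # index of the currently correct pair member
        self.streak = 0    # consecutive rewarded trials

    def step(self, spot):
        """Return reward 1 if the agent chose the correct spot, else 0."""
        reward = 1 if spot == self.pair[self.turn] else 0
        if reward:
            self.streak += 1
            self.turn = 1 - self.turn  # correct target alternates
            if self.streak >= self.switch_after:  # switch the target pair
                self.pair = self.rng.choice(
                    [p for p in self.PAIRS if p != self.pair])
                self.streak = 0
        else:
            self.streak = 0
        return reward


class SarsaAgent:
    """Tabular SARSA whose state is the last two (action, reward) pairs,
    approximating the two-trial history the abstract says is needed."""

    def __init__(self, n_actions=4, alpha=0.2, gamma=0.9, eps=0.1, seed=1):
        self.rng = random.Random(seed)
        self.n_actions, self.alpha = n_actions, alpha
        self.gamma, self.eps = gamma, eps
        self.q = {}  # (state, action) -> value

    def act(self, state):
        if self.rng.random() < self.eps:        # epsilon-greedy exploration
            return self.rng.randrange(self.n_actions)
        vals = [self.q.get((state, a), 0.0) for a in range(self.n_actions)]
        best = max(vals)
        return self.rng.choice([a for a, v in enumerate(vals) if v == best])

    def update(self, s, a, r, s2, a2):
        target = r + self.gamma * self.q.get((s2, a2), 0.0)
        self.q[(s, a)] = self.q.get((s, a), 0.0) + self.alpha * (
            target - self.q.get((s, a), 0.0))


def run(n_trials=5000):
    """Train on the task; return the reward rate over the last 1000 trials."""
    task, agent = TwoTargetSearchTask(), SarsaAgent()
    state = ((-1, 0), (-1, 0))  # two-trial history, initially empty
    action = agent.act(state)
    rewards = []
    for _ in range(n_trials):
        r = task.step(action)
        rewards.append(r)
        next_state = (state[1], (action, r))
        next_action = agent.act(next_state)
        agent.update(state, action, r, next_state, next_action)
        state, action = next_state, next_action
    return sum(rewards[-1000:]) / 1000
```

Because the two-trial history disambiguates which pair member is currently correct, even plain SARSA with this hand-built state exceeds the 0.25 chance level here. The paper's contribution is that such histories need not be fixed in advance: the dynamic state space grows them as needed within each episode.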
first_indexed 2024-12-12T09:37:20Z
format Article
id doaj.art-c2ac593b9b344bbe8a4b241a214e663c
institution Directory Open Access Journal
issn 1662-5188
language English
last_indexed 2024-12-12T09:37:20Z
publishDate 2022-06-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Computational Neuroscience
spelling doaj.art-c2ac593b9b344bbe8a4b241a214e663c 2022-12-22T00:28:42Z
eng | Frontiers Media S.A. | Frontiers in Computational Neuroscience | 1662-5188 | 2022-06-01 | vol. 16 | 10.3389/fncom.2022.784604 | article 784604
Reinforcement Learning Model With Dynamic State Space Tested on Target Search Tasks for Monkeys: Extension to Learning Task Events
Kazuhiro Sakamoto, Department of Neuroscience, Faculty of Medicine, Tohoku Medical and Pharmaceutical University, Sendai, Japan
Kazuhiro Sakamoto, Department of Physiology, Tohoku University School of Medicine, Sendai, Japan
Hinata Yamada, Department of Neuroscience, Faculty of Medicine, Tohoku Medical and Pharmaceutical University, Sendai, Japan
Norihiko Kawaguchi, Department of Physiology, Tohoku University School of Medicine, Sendai, Japan
Yoshito Furusawa, Department of Physiology, Tohoku University School of Medicine, Sendai, Japan
Naohiro Saito, Department of Physiology, Tohoku University School of Medicine, Sendai, Japan
Hajime Mushiake, Department of Physiology, Tohoku University School of Medicine, Sendai, Japan
https://www.frontiersin.org/articles/10.3389/fncom.2022.784604/full
reinforcement learning; target search task; dynamic state space; episode-dependent learning; history-in-episode architecture
spellingShingle Kazuhiro Sakamoto
Kazuhiro Sakamoto
Hinata Yamada
Norihiko Kawaguchi
Yoshito Furusawa
Naohiro Saito
Hajime Mushiake
Reinforcement Learning Model With Dynamic State Space Tested on Target Search Tasks for Monkeys: Extension to Learning Task Events
Frontiers in Computational Neuroscience
reinforcement learning
target search task
dynamic state space
episode-dependent learning
history-in-episode architecture
title Reinforcement Learning Model With Dynamic State Space Tested on Target Search Tasks for Monkeys: Extension to Learning Task Events
title_full Reinforcement Learning Model With Dynamic State Space Tested on Target Search Tasks for Monkeys: Extension to Learning Task Events
title_fullStr Reinforcement Learning Model With Dynamic State Space Tested on Target Search Tasks for Monkeys: Extension to Learning Task Events
title_full_unstemmed Reinforcement Learning Model With Dynamic State Space Tested on Target Search Tasks for Monkeys: Extension to Learning Task Events
title_short Reinforcement Learning Model With Dynamic State Space Tested on Target Search Tasks for Monkeys: Extension to Learning Task Events
title_sort reinforcement learning model with dynamic state space tested on target search tasks for monkeys extension to learning task events
topic reinforcement learning
target search task
dynamic state space
episode-dependent learning
history-in-episode architecture
url https://www.frontiersin.org/articles/10.3389/fncom.2022.784604/full
work_keys_str_mv AT kazuhirosakamoto reinforcementlearningmodelwithdynamicstatespacetestedontargetsearchtasksformonkeysextensiontolearningtaskevents
AT kazuhirosakamoto reinforcementlearningmodelwithdynamicstatespacetestedontargetsearchtasksformonkeysextensiontolearningtaskevents
AT hinatayamada reinforcementlearningmodelwithdynamicstatespacetestedontargetsearchtasksformonkeysextensiontolearningtaskevents
AT norihikokawaguchi reinforcementlearningmodelwithdynamicstatespacetestedontargetsearchtasksformonkeysextensiontolearningtaskevents
AT yoshitofurusawa reinforcementlearningmodelwithdynamicstatespacetestedontargetsearchtasksformonkeysextensiontolearningtaskevents
AT naohirosaito reinforcementlearningmodelwithdynamicstatespacetestedontargetsearchtasksformonkeysextensiontolearningtaskevents
AT hajimemushiake reinforcementlearningmodelwithdynamicstatespacetestedontargetsearchtasksformonkeysextensiontolearningtaskevents