A one-shot shift from explore to exploit in monkey prefrontal cortex

Much animal learning is slow, with cumulative changes in behavior driven by reward prediction errors. When the abstract structure of a problem is known, however, both animals and formal learning models can rapidly attach new items to their roles within this structure, sometimes in a single trial. Fr...

Bibliographic Details
Main Authors:	Achterberg, J, Kadohisa, M, Watanabe, K, Kusunoki, M, Buckley, MJ, Duncan, J
Format:	Journal article
Language:	English
Published:	Society for Neuroscience 2021

collection	OXFORD
description	Much animal learning is slow, with cumulative changes in behavior driven by reward prediction errors. When the abstract structure of a problem is known, however, both animals and formal learning models can rapidly attach new items to their roles within this structure, sometimes in a single trial. Frontal cortex is likely to play a key role in this process. To examine information seeking and use in a known problem structure, we trained monkeys in an explore/exploit task, requiring the animal first to test objects for their association with reward, then, once rewarded objects were found, to re-select them on further trials for further rewards. Many cells in the frontal cortex showed an explore/exploit preference aligned with the one-shot learning in the monkeys’ behavior: the population switched from an explore state to an exploit state after a single trial of learning, but partially maintained the explore state if an error indicated that learning had failed. Binary switch from explore to exploit was not explained by continuous changes linked to expectancy or prediction error. Explore/exploit preferences were independent for two stages of the trial, object selection and receipt of feedback. Within an established task structure, frontal activity may control the separate processes of explore and exploit, switching in one trial between the two.
id	oxford-uuid:70915a8e-aa71-4d81-8957-a0a9570543de
institution	University of Oxford

Similar Items