Dopamine encoding of novelty facilitates efficient uncertainty-driven exploration

Bibliographic Details
Main Authors:	Wang, Y, Lak, A, Manohar, SG, Bogacz, R
Format:	Journal article
Language:	English
Published:	Public Library of Science 2024

author	Wang, Y Lak, A Manohar, SG Bogacz, R
collection	OXFORD
description	When facing an unfamiliar environment, animals need to explore to gain new knowledge about which actions provide reward, but also put the newly acquired knowledge to use as quickly as possible. Optimal reinforcement learning strategies should therefore assess the uncertainties of these action-reward associations and utilise them to inform decision making. We propose a novel model whereby direct and indirect striatal pathways act together to estimate both the mean and variance of reward distributions, and mesolimbic dopaminergic neurons provide transient novelty signals, facilitating effective uncertainty-driven exploration. We utilised electrophysiological recording data to verify our model of the basal ganglia, and we fitted exploration strategies derived from the neural model to data from behavioural experiments. We also compared, in simulation, the performance of directed exploration strategies inspired by our basal ganglia model with other exploration algorithms, including classic variants of the upper confidence bound (UCB) strategy. The exploration strategies inspired by the basal ganglia model can achieve overall superior performance in simulation, and we found qualitatively similar results when fitting the model to behavioural data compared with fitting more idealised normative models with less implementation-level detail. Overall, our results suggest that transient dopamine levels in the basal ganglia that encode novelty could contribute to an uncertainty representation which efficiently drives exploration in reinforcement learning.
first_indexed	2024-09-25T04:09:11Z
format	Journal article
id	oxford-uuid:9759dee1-c09e-4f14-951f-05c73b542390
institution	University of Oxford
language	English
last_indexed	2024-09-25T04:09:11Z
publishDate	2024
publisher	Public Library of Science
record_format	dspace
spelling	oxford-uuid:9759dee1-c09e-4f14-951f-05c73b542390; 2024-06-06T13:40:40Z; Dopamine encoding of novelty facilitates efficient uncertainty-driven exploration; Journal article; http://purl.org/coar/resource_type/c_dcae04bc; uuid:9759dee1-c09e-4f14-951f-05c73b542390; English; Symplectic Elements; Public Library of Science; 2024; Wang, Y; Lak, A; Manohar, SG; Bogacz, R
title	Dopamine encoding of novelty facilitates efficient uncertainty-driven exploration

Similar Items