Summary: | Dopaminergic neurons in the mammalian substantia nigra displaycharacteristic phasic responses to stimuli which reliably predict thereceipt of primary rewards. These responses have been suggested toencode reward prediction-errors similar to those used in reinforcementlearning. Here, we propose a model of dopaminergic activity in whichprediction error signals are generated by the joint action ofshort-latency excitation and long-latency inhibition, in a networkundergoing dopaminergic neuromodulation of both spike-timing dependentsynaptic plasticity and neuronal excitability. In contrast toprevious models, sensitivity to recent events is maintained by theselective modification of specific striatal synapses, efferent tocortical neurons exhibiting stimulus-specific, temporally extendedactivity patterns. Our model shows, in the presence of significantbackground activity, (i) a shift in dopaminergic response from rewardto reward predicting stimuli, (ii) preservation of a response tounexpected rewards, and (iii) a precisely-timed below-baseline dip inactivity observed when expected rewards are omitted.
|