Separating Probability and Reversal Learning in a Novel Probabilistic Reversal Learning Task for Mice

The exploration/exploitation tradeoff – pursuing a known reward vs. sampling from lesser-known options in the hope of finding a better payoff – is a fundamental aspect of learning and decision making. In humans, this tradeoff has been studied using multi-armed bandit tasks; the same processes have also been studied using simplified probabilistic reversal learning (PRL) tasks with binary choices. Our investigations suggest that protocols previously used to explore PRL in mice may be beyond their cognitive capacities, with animals performing no better than chance.

We therefore sought a novel probabilistic learning task that would improve behavioral responding in mice whilst still allowing investigation of the exploration/exploitation tradeoff in decision making. To achieve this, we developed a two-lever operant chamber task in which each lever carries a different probability (high/low) of delivering a saccharin reward, with the reward contingencies reversing once animals reach a criterion of 80% responding on the high-reward lever. We found that, unlike in existing PRL tasks, mice are able to learn this task and behave near-optimally with 80% high/20% low reward probabilities. Shifting the reward contingencies towards equality revealed that some mice maintained a preference for the high-reward lever with probabilities as close as 60% high/40% low. Additionally, we show that animal choice behavior can be modelled effectively using reinforcement learning (RL) models incorporating separate learning rates for positive and negative prediction errors, a perseveration parameter, and a noise parameter. This new decision task, coupled with RL analyses, opens new avenues for investigating the neuroscience of the exploration/exploitation tradeoff in decision making.
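
The abstract specifies both the task structure (two levers with high/low reward probabilities and a reversal triggered by an 80% responding criterion) and the family of RL models fitted (separate learning rates for positive and negative prediction errors, a perseveration parameter, and a noise parameter). The Python sketch below simulates a task and agent of this kind. It is illustrative only, not the authors' published code: the moving-window size for the reversal criterion, all parameter values, and the choice of a softmax inverse temperature as the noise parameter are assumptions made for this sketch.

```python
"""Minimal sketch (illustrative, not the authors' published code): a two-lever
probabilistic reversal learning task and a Q-learning agent with separate
learning rates for positive/negative prediction errors, a perseveration
bonus, and softmax noise. Window size and parameter values are assumptions."""
import numpy as np

rng = np.random.default_rng(seed=1)

# Task parameters (80%/20% probabilities from the abstract; window is assumed)
P_HIGH, P_LOW = 0.80, 0.20  # reward probability on the high/low lever
CRITERION = 0.80            # reverse at 80% responding on the high lever
WINDOW = 20                 # assumed moving window for the criterion
N_TRIALS = 1000

# Agent parameters (values are illustrative)
alpha_pos = 0.40  # learning rate for positive prediction errors
alpha_neg = 0.10  # learning rate for negative prediction errors
kappa = 0.50      # perseveration bonus for repeating the last choice
beta = 5.0        # softmax inverse temperature (the "noise" parameter)

Q = np.zeros(2)       # learned value of each lever
high_lever = 0        # which lever currently pays with P_HIGH
prev_choice = None
recent = []           # recent high-lever choices, for the reversal criterion
n_reversals = 0

for _ in range(N_TRIALS):
    # Softmax over values, with a bonus for repeating the previous choice
    v = beta * Q
    if prev_choice is not None:
        v[prev_choice] += kappa
    p = np.exp(v - v.max())
    p /= p.sum()
    choice = int(rng.choice(2, p=p))

    # Probabilistic reward delivery
    reward = float(rng.random() < (P_HIGH if choice == high_lever else P_LOW))

    # Dual-learning-rate prediction-error update
    delta = reward - Q[choice]
    Q[choice] += (alpha_pos if delta >= 0 else alpha_neg) * delta

    # Reverse the contingencies once the criterion is met over the window
    recent.append(choice == high_lever)
    if len(recent) > WINDOW:
        recent.pop(0)
    if len(recent) == WINDOW and np.mean(recent) >= CRITERION:
        high_lever = 1 - high_lever
        recent.clear()
        n_reversals += 1

    prev_choice = choice

print(f"{n_reversals} reversals completed in {N_TRIALS} trials")
```

In a fitting context, the four free parameters (alpha_pos, alpha_neg, kappa, beta) would typically be estimated per animal by maximizing the likelihood of the observed choice sequence under the model; the estimation procedure actually used in the paper is not described in this record.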

Bibliographic Details
Main Authors: Jeremy A. Metha, Maddison L. Brian, Sara Oberrauch, Samuel A. Barnes, Travis J. Featherby, Peter Bossaerts, Carsten Murawski, Daniel Hoyer, Laura H. Jacobson
Format: Article
Language: English
Published: Frontiers Media S.A., 2020-01-01
Series: Frontiers in Behavioral Neuroscience, Vol. 13
ISSN: 1662-5153
DOI: 10.3389/fnbeh.2019.00270
Subjects: reinforcement; probabilistic; discrimination; reversal; learning; mouse
Online Access: https://www.frontiersin.org/article/10.3389/fnbeh.2019.00270/full
Collection: Directory of Open Access Journals (DOAJ)

Author Affiliations:
1. Sleep and Cognition, The Florey Institute of Neuroscience and Mental Health, Parkville, VIC, Australia
2. Translational Neuroscience, Department of Pharmacology and Therapeutics, School of Biomedical Sciences, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Parkville, VIC, Australia
3. Brain, Mind and Markets Laboratory, Department of Finance, Faculty of Business and Economics, The University of Melbourne, Parkville, VIC, Australia
4. Department of Psychiatry, School of Medicine, University of California, San Diego, La Jolla, CA, United States
5. Behavioral Core, The Florey Institute of Neuroscience and Mental Health, Parkville, VIC, Australia
6. Department of Molecular Medicine, The Scripps Research Institute, La Jolla, CA, United States

Jeremy A. Metha (1, 2, 3); Maddison L. Brian (1, 2); Sara Oberrauch (1, 2); Samuel A. Barnes (4); Travis J. Featherby (5); Peter Bossaerts (3); Carsten Murawski (3); Daniel Hoyer (1, 2, 6); Laura H. Jacobson (1, 2)