Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts

Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as those of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What of the effects of action bias (for actions per se) and action hysteresis determined by the history of actions chosen previously? The present study addressed these questions with incremental assembly of models for the sequential choice data from a task with hierarchical structure for additional complexity in learning. With systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants, even comparable in magnitude to the individual differences in learning. Individuals who did not learn well revealed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting from multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions. In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.
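For intuition only, here is a minimal sketch in Python of how such parallel modules can be combined in a single choice rule: a softmax policy whose logits sum a Q-learning "expert" with static action-bias and hysteresis "nonexpert" terms. This is an illustrative assumption, not the authors' actual model; the parameter names (alpha, beta, bias, kappa), the two-action setup, and the one-trial-back hysteresis are all hypothetical simplifications.

import numpy as np

def simulate_choices(rewards, alpha=0.1, beta=5.0, bias=0.5, kappa=-0.3, rng=None):
    # Illustrative sketch (not the published model): a softmax policy
    # mixing a Q-learning "expert" with "nonexpert" bias and hysteresis.
    # rewards: numpy array of shape (n_trials, 2), payoff of each action.
    rng = np.random.default_rng() if rng is None else rng
    q = np.zeros(2)                  # learned action values (the "expert" module)
    prev = None                      # previous action, drives hysteresis
    choices = []
    for t in range(len(rewards)):
        logits = beta * q            # value-based drive, scaled by inverse temperature
        logits[0] += bias            # static bias for one action per se, reward-independent
        if prev is not None:
            logits[prev] += kappa    # kappa > 0: repetition bias; kappa < 0: alternation bias
        p = np.exp(logits - logits.max())
        p /= p.sum()                 # softmax over the summed logits
        a = rng.choice(2, p=p)
        q[a] += alpha * (rewards[t, a] - q[a])  # delta-rule update of the chosen value
        prev = a
        choices.append(a)
    return choices

A negative kappa, as sketched here, produces the alternation bias that the abstract reports as the more common pattern; fitting per-participant bias and kappa alongside alpha and beta is what allows such nonexpert terms to be compared in magnitude with learning itself.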

Bibliographic Details
Main Authors: Jaron T Colas, John P O'Doherty, Scott T Grafton
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2024-03-01
Series: PLoS Computational Biology, Vol. 20, Iss. 3, e1011950
ISSN: 1553-734X, 1553-7358
DOI: 10.1371/journal.pcbi.1011950
Online Access: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011950&type=printable