2022
DOI: 10.1073/pnas.2113961119

Mice exhibit stochastic and efficient action switching during probabilistic decision making

Abstract (Significance): To obtain rewards in changing and uncertain environments, animals must adapt their behavior. We found that mouse choice and trial-to-trial switching behavior in a dynamic and probabilistic two-choice task could be modeled by equivalent theoretical, algorithmic, and descriptive models. These models capture components of evidence accumulation, choice history bias, and stochasticity in mouse behavior. Furthermore, they reveal that mice adapt their behavior in different environmental context…

Cited by 44 publications (79 citation statements)
References 51 publications
“…In natural environments, animals often have to accumulate rewards by making choices between options that are unreliable predictors of these rewards [1,2]. Studies in both vertebrates and invertebrates have found that animals employ decision-making strategies that account for this unpredictability, allowing them to maximize reward accumulation in a variety of dynamic and unpredictable environments [3-11]. Three key features are typically thought to be involved in the successful strategies that have been identified: i) a representation of information relevant for making choices, such as option value; ii) a tally of past choices and rewards; and iii) a learning rule by which these two are combined to update the decision variables based on experience [10,12-14].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…To examine the local circuit interactions between striatal Ach and DA (Fig. 1a) during, and their contributions to, such behaviors, we monitored neuromodulator levels in mice performing a dynamic and probabilistic two-port choice task modeled after paradigms that engage striatal pathways and require striatal activity for optimal performance [47-49]. In this two-armed bandit task (2ABT), mice move freely within a box containing three ports (Fig.…”
Section: Striatal Ach and Da Levels Are Dynamically Regulated During ...
Citation type: mentioning
confidence: 99%
“…1e third column). Mouse behavior and evidence accumulation in the task can be summarized by a variety of models [49-51]. Here, we will employ a recursively-formulated logistic regression (RFLR, see methods) that was developed from a behavior task similar to the one we employ [49].…”
Section: Striatal Ach and Da Levels Are Dynamically Regulated During ...
Citation type: mentioning
confidence: 99%
“…To illustrate this space, we focus on a well-studied task in which an animal, or agent, gathers rewards from two ports whose reward probabilities change dynamically over time according to a hidden world state [11, 12, 13, 14, 15, 16]. To improve performance on this task, the agent can leverage knowledge about the volatility of the environment and the probability of rewards at each port in order to update an internal estimate of the world state based on the outcomes of its past actions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
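
The last citation statement describes an agent that updates an internal estimate of a hidden world state from the outcomes of its past actions. A minimal sketch of one such update is given below, assuming a two-state world (either the left or the right port is the high-reward port), a Bayesian update on each trial's outcome, and a fixed per-trial probability that the state flips. The function name `update_belief` and the values of `p_high`, `p_low`, and `hazard` are hypothetical illustrations, not parameters taken from the cited papers.

```python
def update_belief(belief_left, choice, rewarded,
                  p_high=0.8, p_low=0.2, hazard=0.02):
    """Return the updated probability that the left port is the high-reward port.

    belief_left: prior probability that left is the high-reward port.
    choice: "left" or "right", the port chosen on this trial.
    rewarded: True if the choice was rewarded.
    p_high, p_low, hazard: assumed (hypothetical) reward and switch probabilities.
    """
    def likelihood(chosen_port_is_high):
        # Probability of the observed outcome given the chosen port's quality.
        p = p_high if chosen_port_is_high else p_low
        return p if rewarded else 1.0 - p

    # Likelihood of the outcome under each hypothesis about the world state.
    like_if_left_high = likelihood(choice == "left")
    like_if_right_high = likelihood(choice == "right")

    # Bayes' rule: combine the prior with the outcome likelihoods.
    numerator = belief_left * like_if_left_high
    posterior = numerator / (numerator + (1.0 - belief_left) * like_if_right_high)

    # Account for a possible state switch before the next trial.
    return posterior * (1.0 - hazard) + (1.0 - posterior) * hazard


# Example: starting from no knowledge, a rewarded left choice raises the belief
# that left is the high-reward port; an unrewarded one lowers it.
after_reward = update_belief(0.5, "left", True)
after_omission = update_belief(0.5, "left", False)
```

Under these assumptions a larger `hazard` pulls the belief back toward 0.5 more strongly after each trial, which is one way knowledge of environmental volatility could enter the estimate.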