Optimal Temporal Risk Assessment

2015

Psychon Bull Rev

Previous studies showed that rats and pigeons can count their responses, and that the resulting count-based judgments exhibit the scalar property (also known as Weber's Law), a psychophysical property that also characterizes interval-timing behavior. Animals were found to take a nearly normative account of these well-established endogenous uncertainty characteristics in their time-based decision-making. On the other hand, no study has yet tested the implications of the scalar property of numerosity representations for reward-rate maximization in count-based decision-making. The current study tested mice on a task that required them to press one lever a minimum number of times before pressing a second lever to collect the armed reward (fixed consecutive number schedule, FCN). Emitting fewer responses than required reset the response count without reinforcement, whereas emitting at least the minimum number of responses reset the response counter with reinforcement. Each mouse was tested with three different FCN schedules (FCN10, FCN20, FCN40). The number of responses emitted on the first lever before pressing the second lever constituted the main unit of analysis. Our findings showed for the first time that mice count their responses with the scalar property. We then defined the reward-rate maximizing numerical decision strategies in this task based on subject-based estimates of the endogenous counting uncertainty. Our results showed that mice learn to maximize the reward rate by incorporating the uncertainty in their numerosity judgments into their count-based decisions. Our findings extend the scope of optimal temporal risk assessment to the domain of count-based decision-making.
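The idea of a reward-rate maximizing target count under scalar counting noise can be illustrated with a minimal sketch. This is not the model fitted in the study; it assumes, purely for illustration, that the run length on the first lever is normally distributed around a target with a standard deviation proportional to that target (a coefficient of variation `cv`), and that each trial carries a fixed cost `overhead` expressed in response-equivalents. The function names, the `overhead` value, and the grid search are all hypothetical.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def reward_rate(target, required, cv, overhead=5.0):
    """Expected reward per unit of effort when aiming for `target` presses.

    Assumption: run length ~ Normal(target, cv * target), i.e. the spread
    grows in proportion to the target (scalar property).
    Assumption: `overhead` is a hypothetical fixed per-trial cost.
    """
    sigma = cv * target
    p_reward = 1.0 - normal_cdf(required, target, sigma)  # P(run >= required)
    return p_reward / (target + overhead)


def optimal_target(required, cv, overhead=5.0, step=0.1):
    """Grid search for the target count with the highest reward rate."""
    best_target, best_rate = None, -1.0
    n_steps = int(round((3.0 * required - 1.0) / step))
    for i in range(n_steps + 1):
        target = 1.0 + i * step
        rate = reward_rate(target, required, cv, overhead)
        if rate > best_rate:
            best_target, best_rate = target, rate
    return best_target


# Illustrative sweep over the three schedule requirements named in the abstract.
for n in (10, 20, 40):
    print(n, round(optimal_target(n, cv=0.2), 1))
```

Under these assumptions the best target lies above the required count, and it moves further above it as `cv` grows over a moderate range, because a noisier counter needs a larger safety margin to keep the probability of falling short low.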

Section: Discussion, supporting

confidence: 91%

Section: Discussion, supporting

confidence: 61%

Section: Introduction, mentioning

confidence: 99%

Mice can count and optimize count-based decisions

Çavdaroğlu

2015

Psychon Bull Rev

“…Findings from earlier studies that adopted such an analytical approach [8][9][10][11][12] showed that both humans and rodents can adopt nearly optimal strategies by taking account of their endogenous uncertainty in tasks that required various decisions. In one of these experiments [8], humans and mice were trained in a temporal discrimination task that was composed of two types of trials (short latency trial and long latency trial) presented probabilistically.…”

Section: Introduction, mentioning

confidence: 99%

Time-based reward maximization

Çavdaroğlu

Zeki

2014

Phil. Trans. R. Soc. B

Humans and animals time intervals from seconds to minutes with high accuracy but limited precision. Consequently, time-based decisions are inevitably subject to our endogenous timing uncertainty, and thus require temporal risk assessment. In this study, we tested the temporal risk assessment ability of humans in a task where participants had to withhold each subsequent response for a minimum duration to earn reward and each response reset the trial time. Premature responses were not penalized in Experiment 1 but were penalized in Experiment 2. Participants tried to maximize reward within a fixed session time (over eight sessions) by pressing a key. No instructions were provided regarding the task rules/parameters. We evaluated empirical performance within the framework of optimality that was based on the level of endogenous timing uncertainty and the payoff structure. Participants nearly tracked the optimal target inter-response times (IRTs) that changed as a function of the level of timing uncertainty and maximized the reward rate in both experiments. Acquisition of the optimal target IRT was rapid and abrupt, without any further improvement or worsening. These results constitute an example of optimal temporal risk assessment performance in a task that required finding the optimal trade-off between the ‘speed’ (timing) and ‘accuracy’ (reward probability) of timed responses for reward maximization.
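The dependence of the optimal target IRT on timing uncertainty and on the payoff structure can be sketched as follows. This is an illustrative toy model, not the analysis used in the study: it assumes that produced IRTs are normally distributed around the target with a standard deviation of `cv` times the target, that a successful wait earns one unit of reward, and that a premature response costs a hypothetical `penalty` (0 for an unpenalized condition). All names and parameter values are assumptions.

```python
import math


def p_success(target_irt, min_irt, cv):
    """P(IRT >= min_irt), assuming IRT ~ Normal(target_irt, cv * target_irt)."""
    sigma = cv * target_irt
    z = (target_irt - min_irt) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def expected_gain_rate(target_irt, min_irt, cv, penalty=0.0):
    """Expected net gain per unit time when aiming for `target_irt`.

    Assumption: each response resets the clock, so the mean trial
    duration equals the target IRT.
    """
    p = p_success(target_irt, min_irt, cv)
    return (p - (1.0 - p) * penalty) / target_irt


def optimal_target_irt(min_irt, cv, penalty=0.0, step=0.01):
    """Grid search between 0.5x and 3x the minimum wait."""
    best_t, best_rate = None, -math.inf
    n_steps = int(round(2.5 * min_irt / step))
    for i in range(n_steps + 1):
        t = 0.5 * min_irt + i * step
        rate = expected_gain_rate(t, min_irt, cv, penalty)
        if rate > best_rate:
            best_t, best_rate = t, rate
    return best_t


# Illustrative comparison of an unpenalized and a penalized payoff structure.
print(round(optimal_target_irt(1.0, cv=0.2), 2))
print(round(optimal_target_irt(1.0, cv=0.2, penalty=1.0), 2))
```

In this toy model, waiting longer raises the probability of reward but lowers the number of attempts per unit time, so the best target sits somewhat above the minimum wait. Adding a penalty for premature responses pushes that target later.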

“…The resultant timed behaviors have been shown to be sensitive to other crucial elements of environmental statistics, such as the probabilities of different outcomes (2)(3)(4). However, how steady-state choice behavior emerges in temporal decision-making tasks that contain probabilistic contingencies remains to be answered.…”

mentioning

confidence: 99%

Mice plan decision strategies based on previously learned time intervals, locations, and probabilities

Tosun

Gür

Proc. Natl. Acad. Sci. U.S.A.

2016

Animals can shape their timed behaviors based on experienced probabilistic relations in a nearly optimal fashion. On the other hand, it is not clear if they adopt these timed decisions by making computations based on previously learnt task parameters (time intervals, locations, and probabilities) or if they gradually develop their decisions based on trial and error. To address this question, we tested mice in the timed-switching task, which required them to anticipate when (after a short or long delay) and at which of the two delay locations a reward would be presented. The probability of short trials differed between test groups in two experiments. Critically, we first trained mice on relevant task parameters by signaling the active trial with a discriminative stimulus and delivered the corresponding reward after the associated delay without any response requirement (without inducing switching behavior). During the test phase, both options were presented simultaneously to characterize the emergence and temporal characteristics of the switching behavior. Mice exhibited timed-switching behavior starting from the first few test trials, and their performance remained stable throughout testing in the majority of the conditions. Furthermore, as the probability of the short trial increased, mice waited longer before switching from the short to long location (experiment 1). These behavioral adjustments were in directions predicted by reward maximization. These results suggest that rather than gradually adjusting their time-dependent choice behavior, mice abruptly adopted temporal decision strategies by directly integrating their previous knowledge of task parameters into their timed behavior, supporting the model-based representational account of temporal risk assessment.

decision making | interval timing | temporal risk assessment | probabilities | mice

Many vertebrate species can build temporal expectancies and cluster their anticipatory behaviors around intervals that lead to critical outcomes (1). The resultant timed behaviors have been shown to be sensitive to other crucial elements of environmental statistics, such as the probabilities of different outcomes (2-4). However, how steady-state choice behavior emerges in temporal decision-making tasks that contain probabilistic contingencies remains to be answered. Do animals directly manifest their knowledge of quantities (e.g., time interval, probability) and locations in their timed behavior or do they gradually acquire differential timed response patterns based on reinforcement learning in a fashion stripped of representations? This study aimed to address this fundamental question using a simple temporal decision-making task.

A class of interval timing paradigms requires the animals to distribute their responses between two (short vs. long latency) options, each of which predicts reward after the corresponding fixed delay. In these cases, the emergent response pattern is first behaviorally investing in the option with a short delay to the reward, and if responding a...