Rats responded on 2 levers delivering brain stimulation reward on concurrent variable interval schedules. Following many successive sessions with unchanging relative rates of reward, subjects adjusted to an eventual change slowly and showed spontaneous reversions at the beginning of subsequent sessions. When changes in rates of reward occurred between and within every session, subjects adjusted to them about as rapidly as they could in principle do so, as shown by comparison to a Bayesian model of an ideal detector. This and other features of the adjustments to frequent changes imply that the behavioral effect of reinforcement depends on the subject's perception of incomes and changes in incomes rather than on the strengthening and weakening of behaviors in accord with their past effects or expected results. Models for the process by which perceived incomes determine stay durations and for the process that detects changes in rates are developed.
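As a rough illustration of what an ideal detector of a rate change computes, the sketch below compares two fixed hypotheses about a run of interreward intervals: that the old rate still holds, or that a new rate has taken over. It is not the authors' Bayesian model. The exponential-interval assumption, the two-hypothesis setup, and the names `log_likelihood` and `posterior_prob_change` are all hypothetical choices made for this example.

```python
import math

def log_likelihood(rate, intervals):
    """Log-likelihood of interreward intervals under an exponential
    distribution with the given rate (assumed for this sketch)."""
    return sum(math.log(rate) - rate * x for x in intervals)

def posterior_prob_change(intervals, old_rate, new_rate, prior_change=0.5):
    """Posterior probability that the rate has changed from old_rate to
    new_rate, given the intervals observed since the candidate change."""
    log_odds = (
        math.log(prior_change) - math.log(1.0 - prior_change)
        + log_likelihood(new_rate, intervals)
        - log_likelihood(old_rate, intervals)
    )
    # Convert log posterior odds to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))

# Three long intervals after a schedule that used to pay once per second:
# the evidence quickly favors the leaner rate.
p = posterior_prob_change([4.0, 4.0, 4.0], old_rate=1.0, new_rate=0.25)
```

Under these assumptions, a handful of intervals can already shift the posterior strongly, which gives a sense of how fast a detector could in principle respond.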
Rats responded on concurrent variable interval schedules of brain stimulation reward in 2-trial sessions. Between trials, there was a 16-fold reversal in the relative rate of reward. In successive, narrow time windows, the authors compared the ratio of the times spent on the 2 levers to the ratio of the rewards received. Time-allocation ratios tracked wide, random fluctuations in the reward ratio. The adjustment to the midsession reversal in relative rate of reward was largely completed within 1 interreward interval on the leaner schedule. Both results were unaffected by a 16-fold change in the combined rates of reward. The large, rapid, scale-invariant shifts in time-allocation ratios that underlie matching behavior imply that the subjective relative rate of reward can be determined by a very few of the most recent interreward intervals and that this estimate can directly determine the ratio of the expected stay durations.
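The window-by-window comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code. The data layout (stays as `(start, end, lever)` tuples, rewards as `(time, lever)` tuples), the lever labels 1 and 2, and the function names are assumptions made for the example.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def windowed_ratios(stays, rewards, window, session_end):
    """For successive windows, return (window_start, time_ratio, reward_ratio),
    where each ratio is lever 1 over lever 2. A ratio is None when its
    denominator is zero in that window."""
    results = []
    start = 0.0
    while start < session_end:
        end = min(start + window, session_end)
        time_on = {1: 0.0, 2: 0.0}
        for s, e, lever in stays:
            time_on[lever] += overlap(s, e, start, end)
        count = {1: 0, 2: 0}
        for t, lever in rewards:
            if start <= t < end:
                count[lever] += 1
        time_ratio = time_on[1] / time_on[2] if time_on[2] > 0 else None
        reward_ratio = count[1] / count[2] if count[2] > 0 else None
        results.append((start, time_ratio, reward_ratio))
        start = end
    return results

# Made-up session: 4:1 in favor of lever 1, then a reversal to 1:4.
stays = [(0, 8, 1), (8, 10, 2), (10, 12, 1), (12, 20, 2)]
rewards = [(1, 1), (3, 1), (5, 1), (7, 1), (9, 2),
           (11, 1), (13, 2), (15, 2), (17, 2), (19, 2)]
ratios = windowed_ratios(stays, rewards, window=10.0, session_end=20.0)
```

In this invented data set the time-allocation ratio equals the reward ratio in each window, and the change from 4:1 to 1:4 corresponds to a 16-fold reversal.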
The growth of the subjective reward magnitude of medial forebrain bundle stimulation in the rat (Sprague-Dawley) as a function of train duration and pulse frequency was measured in 2 ways: (a) a titration method, which used differences in rate of reward on 2 levers to compensate for differences in the magnitude of the rewards; and (b) a direct method, in which the ratio of the reward magnitudes at the 2 levers was assumed to be given by the ratio of times spent on each lever. The results of the 2 methods agree. Reward magnitude grows as approximately a power function of train duration up to train durations of about 1 s, then declines somewhat over the interval from 2 to 20 s. The exponent of growth varies from 0.4 to 2.3. With stronger stimulation (higher pulse frequency), peak reward magnitude is greater, but the saturating train duration is approximately the same.
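The two measurement methods and the power-function growth can be sketched numerically. The sketch assumes that time allocation matches the ratio of incomes, with income taken as rate of reward times reward magnitude. That assumption is not stated in the abstract, and the function names are hypothetical.

```python
import math

def magnitude_ratio_direct(time_on_a, time_on_b):
    """Direct method: with equal reward rates, the magnitude ratio M_a/M_b
    is taken to equal the ratio of times spent on the two levers."""
    return time_on_a / time_on_b

def magnitude_ratio_titration(rate_a, rate_b):
    """Titration method: once rates are adjusted so that time is split evenly,
    rate_a * M_a == rate_b * M_b (assumed), so M_a/M_b == rate_b/rate_a."""
    return rate_b / rate_a

def fit_power_exponent(durations, magnitudes):
    """Least-squares slope of log(magnitude) against log(duration), i.e. the
    exponent b in magnitude = k * duration**b."""
    xs = [math.log(d) for d in durations]
    ys = [math.log(m) for m in magnitudes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic magnitudes generated with exponent 0.8 for durations up to 1 s.
durations = [0.25, 0.5, 1.0]
magnitudes = [d ** 0.8 for d in durations]
exponent = fit_power_exponent(durations, magnitudes)
```

The fit covers only the rising portion of the curve. The decline reported at longer train durations would need a separate treatment.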
Gibbon (1995) elaborated an ingenious model of matching, a feedforward model that is consistent with Heyman's (1982) suggestion that matching behavior does not depend on selection by consequences. Most models (for example, Herrnstein & Vaughan, 1980) have been feedback models, built on the law of effect. Measurements of how rapidly rats adjust to changes in the relative rates of brain stimulation reward on concurrent random interval schedules imply a feedforward process. The adjustments are, however, too fast to be consistent with Gibbon's model. John Gibbon pioneered the psychophysical study of interval timing and the application of information-processing models to our understanding of conditioned behavior. Among his many highly original contributions was a model of matching behavior (Gibbon, 1995), which differed in a fundamental way from previous models. The difference has potentially far-reaching implications for our understanding of instrumentally conditioned behavior. Unlike most previous models, Gibbon's model does not assume that the consequences of previous responses feed back to affect the relative strengths of competing behaviors (for a review of models of this type, see Lea & Dow, 1984). Gibbon's model is a purely feedforward model. The experience of different intervals between rewards elicits stay durations inversely proportionate to the ratio of those intervals, without regard to the effect that the animal's behavior has on those intervals. The law of effect ought to apply with exceptional directness when subjects are given a matching protocol. Thorndike (1911, p. 244) wrote "The Law of Effect is that: Of several responses made to the same situation, those