Two bar-press experiments with rats tested the rule that reducing expectation of reward increases the variation from which reward selects. Experiment 1 used a discrete-trial random-interval schedule, with trials signalled by light or sound. One signal always ended with reward; the other ended with reward less often. The two signals were randomly mixed. Bar-press duration (how long the bar is held down) varied more during the signal with the lower probability of reward. Experiment 2 closely resembled Experiment 1 but used a random-ratio schedule rather than a random-interval schedule. Again, bar-press duration varied more during the signal with the lower probability of reward. The results support the rule, and they are the first well-controlled comparisons to do so.
Control of Variation by Reward Probability 29 March 2004 version 3
Control of Variation by Reward Probability
Instrumental learning requires both variation and selection, that is, variation of action and selection by reward of what works (Staddon & Simmelhag, 1971; Hull, Langman, & Glenn, 2001), but few experiments have studied variation (Chance, 1999; Domjan, 1998; Lieberman, 1990; Pearce, 1997; Staddon & Cerutti, 2003). Yet it is plausible that variation depends on recent events, just as response rate does. If an animal's actions vary too little, it will not find better ways of doing things; if they vary too much, rewarded actions will not be repeated. So at any time there is an optimal amount of variation, which changes as the costs and benefits of variation change. Animals that do instrumental learning would profit from a mechanism that regulates variation so that the actual amount is close to the optimal amount. If such a mechanism exists, we know little about it.
Experiments about variation have been rare partly because variation has been hard to measure. Antonitis (1951), for example, took "6,600 photographs of nose-thrusting responses" (p. 275) to study variation in nose position. Herrnstein (1961) used a special apparatus with ten switches to measure the location of key pecks.
Although most studies of variation have measured spatial variation (e.g., Balsam, Deich, Ohyama, & Stokes, 1998; Eckerman & Lanson, 1969; Ferraro & Branch, 1968; Neuringer, 2002), temporal variation may be much easier to measure. A few studies (Lacter & Corey, 1982; Margulies, 1961; Millenson, Hurwitz, & Nixon, 1961) have measured the variation of bar-press duration (how long the bar is held down). Rats can be trained to make bar presses of specified durations (e.g., Notterman & Mintz, 1965;
Platt, Kuch, & Bitgood, 1973), so the variation measured by bar-press duration includes the variation from which reward selects.
Our use of bar-press duration to study variation began with a puzzle involving rats trained with the peak procedure (Gharib, Derby, & Roberts, 2001). On most trials, food was given for the first bar press more than 40 sec after the start of the trial, at which point the trial ended. On some trials, however, no food was gi...