Two procedures commonly used to study choice are concurrent reinforcement and probability learning. Under concurrent-reinforcement procedures, once a reinforcer is scheduled, it remains available indefinitely until collected; reinforcement therefore becomes increasingly likely with the passage of time or with responses on other operanda. Under probability learning, reinforcer probabilities are constant and independent of the passage of time or of responding: a particular reinforcer is gained or not on the basis of a single response, and potential reinforcers are not retained, as when betting at a roulette wheel. In the "real" world, continued availability of reinforcers often lies between these two extremes, with potential reinforcers being lost owing to competition, maturation, decay, and random scatter. The authors parametrically manipulated the likelihood of continued reinforcer availability, defined as hold, and examined the effects on pigeons' choices. Choices varied as power functions of obtained reinforcers under all values of hold. Stochastic models provided generally good descriptions of choice emissions, with deviations from stochasticity systematically related to hold. Thus, a single set of principles accounted for choices across hold values representing a wide range of real-world conditions.

Keywords: matching; stochastic response; limited hold; reinforcement probability; pigeons

Correspondence concerning this article should be addressed to Allen Neuringer, Psychology Department, Reed College, Portland, OR 97202. E-mail: allen.neuringer@reed.edu

Probability learning and concurrent reinforcement are procedures commonly used to study choice. They share many similarities. For example, subjects choose repeatedly between two options to obtain reinforcers: Human participants press buttons or computer keys for points or money; monkeys look left or right for orange juice or raisins; rats run down alleys or press levers for food pellets; and pigeons peck keys for grain. Choice distributions under both procedures are influenced by the distribution of reinforcers and by other attributes of the reinforcers, such as their quality, amount, and delay. However, aspects of the procedures differ, and there is controversy concerning results, especially whether results from the two procedures are consistent.

Under probability-learning procedures, reinforcer availability depends on a random number generator that is activated (or fired) each time a choice occurs. Thus each choice is reinforced with some probability, and the probabilities are independent of previous events. For example, a left (L) choice might be reinforced with probability 0.4 and a right (R) choice with probability 0.1. In some studies the resulting response distributions are described as probability matching: Ratios of responses (L/R) equal ratios of reinforcement probabilities, or 4 to 1 in the example given (Myers, Lohmeier, & Well, 1994; Vulkan, 2000). Such matching is inefficient: more reinforcers would be earned by choosing the richer alternative exclusively.
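Written out, probability matching in the example above, and the power-function relation the abstract reports, take the following forms. The second expression is the standard generalized-matching formulation; reading the abstract's "power functions of obtained reinforcers" this way is our gloss, not a quotation of the authors' model:

\[
\frac{B_L}{B_R} = \frac{p_L}{p_R} = \frac{0.4}{0.1} = 4
\qquad\qquad
\frac{B_L}{B_R} = b\left(\frac{R_L}{R_R}\right)^{s}
\]

where \(B_L\) and \(B_R\) are response totals, \(p_L\) and \(p_R\) are the programmed reinforcement probabilities, \(R_L\) and \(R_R\) are obtained reinforcers, \(s\) is sensitivity to the reinforcer ratio, and \(b\) is bias.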
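The continuum between the two procedures can be made concrete with a small simulation. The sketch below is our own illustration, not the authors' apparatus or code: the discrete-trial structure, the function and parameter names (simulate_hold, p_sched, hold, p_choose_left), and the fixed-probability choice rule are all assumptions made for the example. Its only claim is that a single retention parameter spans the two procedures described above.

```python
import random

def simulate_hold(p_sched=(0.4, 0.1), hold=0.0, p_choose_left=0.5, trials=100000):
    """Illustrative discrete-trial sketch: a scheduled but uncollected
    reinforcer is retained from one trial to the next with probability `hold`.
    """
    armed = [False, False]   # is a reinforcer currently waiting on [left, right]?
    collected = [0, 0]       # reinforcers obtained on [left, right]
    for _ in range(trials):
        for side in (0, 1):
            # An armed, uncollected reinforcer survives with probability `hold`:
            # hold = 0.0 reduces to probability learning (unclaimed reinforcers are lost),
            # hold = 1.0 reduces to concurrent reinforcement (held until collected).
            if armed[side]:
                armed[side] = random.random() < hold
            # If nothing is waiting, arm a new reinforcer with its scheduling probability.
            if not armed[side]:
                armed[side] = random.random() < p_sched[side]
        # Simulated subject: a fixed-probability choice rule, standing in for
        # whatever choice process one wishes to study.
        choice = 0 if random.random() < p_choose_left else 1
        if armed[choice]:
            collected[choice] += 1
            armed[choice] = False
    return collected

if __name__ == "__main__":
    for h in (0.0, 0.5, 1.0):
        left, right = simulate_hold(hold=h)
        print(f"hold={h}: obtained reinforcers L/R = {left}/{right}")
```

With hold = 0, the probability that a choice is reinforced equals its scheduling probability regardless of history, as in probability learning; with hold = 1, an armed reinforcer waits until collected, so reinforcement on the leaner side becomes increasingly likely the longer it is neglected, as under concurrent reinforcement. Intermediate values model the partial retention of reinforcers that the authors manipulate.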