When experimenters require their subjects to perform some readily recorded response to gain access to discriminative stimuli but do not permit this behavior to alter the schedule of reinforcement, the response is classified, by analogy, as an “observing” response. Observing responses have been used not only to analyze discrimination learning but also to substantiate the concept of conditioned reinforcement and to measure the reinforcing effect of stimuli serving other behavioral functions. A controversy, however, centers on the puzzling question of how observing can be sustained when the resulting stimuli are not associated with any increase in the frequency of primary reinforcement. Two possible answers have been advanced: (a) that differential preparatory responses to these stimuli as conditional stimuli make both the receipt and the nonreceipt of unconditional stimuli more reinforcing; and (b) that information concerning biologically significant events is inherently reinforcing. It appears, however, that the stimulus associated with the less desirable outcome is not reinforcing. The maintenance of observing can be reconciled with the traditional theory that the acquisition of reinforcing properties proceeds according to the same rules as those for Pavlovian conditioning if it is recognized that the subject is selective in what it observes and procures a greater than proportionate exposure to the stimulus associated with the more desirable outcome. As a result of this selection, the overall frequency of primary reinforcement increases in the presence of the observed stimuli and declines in the presence of the nondifferential stimuli that prevail when the subject is not observing.
Early theorists (Skinner, Spence) interpreted discrimination learning in terms of the strengthening of the response to one stimulus and its weakening to the other. But this analysis does not account for the increasing independence of the two performances as training continues or for increases in control by dimensions of a stimulus other than the one used in training. Correlation of stimuli with different densities of reinforcement produces an increase in the behavior necessary to observe them, and greater observing of and attending to the relevant stimuli may account for the increase in control by these stimuli. The observing analysis also encompasses errorless training, and the selective nature of observing explains the feature-positive effect and the relatively shallow gradients of generalization generated by negative discriminative stimuli. The effectiveness of the observing analysis in handling these special cases adds to the converging lines of evidence supporting its integrative power and thus its validity.
A molecular analysis based on the termination of stimuli that are positively correlated with shock and the production of stimuli that are negatively correlated with shock provides a parsimonious account of both traditional discrete-trial avoidance behavior and the data derived from more recent free-operant procedures. The necessary stimuli are provided by the intrinsic feedback generated by the subject's behavior, in addition to those presented by the experimenter. Moreover, all data compatible with the molar principle of shock-frequency reduction as reinforcement are also compatible with a delay-of-shock gradient, but some data compatible with the delay gradient are not compatible with frequency reduction. The delay gradient corresponds to functions relating magnitude of behavioral effect to the time between conditional and unconditional stimuli, the time between conditioned and primary reinforcers, and the time between responses and positive reinforcers.
Five pigeons were used to test the hypothesis that the source of reinforcement for observing behavior is the information that it provides concerning the schedule of primary reinforcement. On a variable-interval schedule, pecking the left-hand key produced a 30-sec display of such information. During this 30-sec period, when pecking the right-hand key was reinforced on a random-interval schedule, both keys were green; when no reinforcement was scheduled (extinction), both keys were red. Later, this baseline procedure, in which both red and green were available, was replaced for blocks of sessions by procedures in which either (a) the red was eliminated and only the green could be produced; or (b) the green was eliminated and only the red could be produced. The results were that green maintained rates of pecking on the left key that were as high as or higher than those maintained when both colors were available, and that red maintained no responding. It was concluded that the reinforcing value of a stimulus depends on the positive or negative direction of its correlation with primary reinforcement, rather than upon the amount of information that it conveys.