The statistic p_rep estimates the probability of replicating an effect. It captures traditional publication criteria for signal-to-noise ratio, while avoiding parametric inference and the resulting Bayesian dilemma. In concert with effect size and replication intervals, p_rep provides all of the information now used in evaluating research, while avoiding many of the pitfalls of traditional statistical inference.

Psychologists, who rightly pride themselves on their methodological expertise, have become increasingly embarrassed by "the survival of a flawed method" (Krueger, 2001) at the heart of their inferential procedures. Null-hypothesis significance tests (NHSTs) provide criteria for separating signal from noise in the majority of published research. They are based on inferred sampling distributions, given a hypothetical value for a parameter such as a population mean (μ) or difference of means between an experimental group (μE) and a control group (μC; e.g., H0: μE − μC = 0). Analysis starts with a statistic on the obtained data, such as the difference in the sample means, D. D is a point on the line with probability mass of zero. It is necessary to relate that point to some interval in order to engage probability theory. Neyman and Pearson (1933) introduced critical intervals over which the probability of observing a statistic is less than a stipulated significance level, α (e.g., z scores between [−∞, −2] and between [+2, +∞] over which α < .05). If a statistic falls within those intervals, it is deemed significantly different from that expected under the null hypothesis. Fisher (1959) preferred to calculate the probability of obtaining a statistic larger than |D| over the interval [|D|, ∞]. This probability, p(x ≥ D|H0), is called the p value of the statistic. Researchers typically hope to obtain a p value sufficiently small (viz., less than α) so that they can reject the null hypothesis. This is where problems arise.
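The two decision rules described above can be sketched in a few lines. This is only an illustration under stated assumptions: the statistic D is treated as normally distributed around zero under the null hypothesis, and the observed difference and its standard error (`d_obs`, `se`) are hypothetical numbers, not data from the article.

```python
import math


def normal_sf(z: float) -> float:
    """Upper-tail probability P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def p_value(d: float, se: float) -> float:
    """One-tailed p value, p(x >= |D| given H0), assuming D ~ Normal(0, se) under H0."""
    return normal_sf(abs(d) / se)


def in_critical_region(d: float, se: float, z_crit: float = 2.0) -> bool:
    """Neyman-Pearson style check: does the z score fall beyond +/- z_crit?"""
    return abs(d) / se >= z_crit


# Hypothetical observed difference of sample means and its standard error.
d_obs, se = 1.2, 0.5
print(f"z = {d_obs / se:.2f}, p = {p_value(d_obs, se):.4f}")
print("in critical region:", in_critical_region(d_obs, se))
```

The first function returns a graded probability in the manner attributed to Fisher; the second returns only a yes-or-no verdict relative to a fixed critical interval.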
Fisher (1959), who introduced NHST, knew that "such a test of significance does not authorize us to make any statement about the hypothesis in question in terms of mathematical probability" (p. 35). This is because such statements concern p(H0 | x ≥ D), which does not generally equal p(x ≥ D|H0). The confusion of one conditional for the other is analogous to the conversion fallacy in propositional logic. Bayes showed that p(H0 | x ≥ D) = p(x ≥ D|H0) p(H0) / p(x ≥ D). The unconditional probabilities are the priors, and are largely unknowable. Fisher (1959) allowed that p(x ≥ D|H0) may "influence [the null's] acceptability" (p. 43). Unfortunately, absent priors, "P values can be highly misleading measures of the evidence provided by the data against the null hypothesis" (Berger & Sellke, 1987, p. 112; also see Nickerson, 2000, p. 248). This constitutes a dilemma: On the one hand, "a test of significance contains no criterion for 'accepting' a hypothesis" (Fisher, 1959, p. 42), and on the other, we cannot safely reject a hypothesis without knowing the priors. Significance tests without priors are the "flaw in our method...
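A small numerical sketch may make the dependence on priors concrete. It assumes, for illustration only, that the null H0 and a single alternative H1 exhaust the possibilities; the likelihoods and priors below are hypothetical values, not estimates from the article.

```python
def posterior_null(p_data_given_h0: float,
                   p_data_given_h1: float,
                   prior_h0: float) -> float:
    """p(H0 | data) by Bayes' rule, assuming H0 and H1 are exhaustive."""
    prior_h1 = 1.0 - prior_h0
    # Unconditional probability of the data: the denominator of Bayes' rule.
    evidence = p_data_given_h0 * prior_h0 + p_data_given_h1 * prior_h1
    return p_data_given_h0 * prior_h0 / evidence


# Same data (p value of .05 under H0, hypothetical .50 under H1), different priors.
for prior in (0.5, 0.9):
    post = posterior_null(0.05, 0.5, prior)
    print(f"prior p(H0) = {prior:.1f} -> posterior p(H0 | data) = {post:.3f}")
```

With these assumed inputs the same p value of .05 leaves the null with a posterior probability of roughly .09 under an even prior and roughly .47 under a prior of .9, which is why p(x ≥ D|H0) alone cannot stand in for p(H0 | x ≥ D).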
Effective conditioning requires a correlation between the experimenter's definition of a response and an organism's, but an animal's perception of its behavior differs from ours. These experiments explore various definitions of the response, using the slopes of learning curves to infer which comes closest to the organism's definition. The resulting exponentially weighted moving average provides a model of memory that is used to ground a quantitative theory of reinforcement. The theory assumes that incentives excite behavior and focus the excitement on responses that are contemporaneous in memory. The correlation between the organism's memory and the behavior measured by the experimenter is given by coupling coefficients, which are derived for various schedules of reinforcement. The coupling coefficients for simple schedules may be concatenated to predict the effects of complex schedules. The coefficients are inserted into a generic model of arousal and temporal constraint to predict response rates under any scheduling arrangement. The theory posits a response-indexed decay of memory, not a time-indexed one. It requires that incentives displace memory for the responses that occur before them, and may truncate the representation of the response that brings them about. As a contiguity-weighted correlation model, it bridges opposing views of the reinforcement process. By placing the short-term memory of behavior in so central a role, it provides a behavioral account of a key cognitive process.
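The memory model named above, an exponentially weighted moving average that decays per response rather than per unit of time, can be sketched as follows. The decay parameter `beta` and the response sequence are hypothetical, and this is a generic EWMA rather than the article's derivation of coupling coefficients.

```python
def ewma_update(memory: float, event: float, beta: float) -> float:
    """One response-indexed update: the newest response gets weight beta,
    and everything already in memory is scaled by (1 - beta)."""
    return beta * event + (1.0 - beta) * memory


def ewma_weights(n_responses: int, beta: float) -> list[float]:
    """Weights on the last n responses, most recent first: beta * (1 - beta)**k."""
    return [beta * (1.0 - beta) ** k for k in range(n_responses)]


# Hypothetical sequence in chronological order: 1 = target response, 0 = other.
responses = [1, 0, 1, 1]
beta = 0.25

memory = 0.0
for r in responses:  # memory is updated once per response, not per second
    memory = ewma_update(memory, r, beta)

# The same quantity as an explicit weighted sum over past responses.
weights = ewma_weights(len(responses), beta)
weighted_sum = sum(w * r for w, r in zip(weights, reversed(responses)))
print(memory, weighted_sum)
```

The final value can be read as the share of memory occupied by target responses at the moment an incentive arrives, which is the kind of quantity a contiguity-weighted correlation would draw on.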
How is it that counting to ourselves helps us to estimate an interval of time? To address this question, we develop a generalized clock-counter model of duration discrimination that allows error in both the timing and the counting processes. We show that in order to minimize variability in temporal judgments, it is usually to the subject's advantage to segment the interval to be judged into subintervals. The optimal duration of the subintervals will depend on the parameters of the fundamental error equations that relate variance to the duration and number of the subintervals; in most cases, however, the optimal duration will be independent of the duration of the interval to be timed. The canonical form of the Weber function derived from our analysis takes as special cases the forms predicted by various other models of temporal discrimination. For long intervals it reduces to Weber's law, with the constant in that law solely a function of counting error.
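The claim that the best subinterval length need not depend on the interval being timed can be illustrated with a deliberately simplified error equation. Everything here is an assumption made for the sketch: each subinterval of duration d is given timing variance a*d**2 + c, subintervals are treated as independent, their number n = t/d is treated as continuous, and counting error is left out entirely. These are not the article's error equations.

```python
import math


def total_variance(t: float, d: float, a: float = 0.01, c: float = 0.04) -> float:
    """Variance of a judgment of interval t built from n = t/d subintervals,
    each with assumed variance a*d**2 + c (hypothetical parameters a, c)."""
    n = t / d
    return n * (a * d ** 2 + c)


def optimal_subinterval(a: float = 0.01, c: float = 0.04) -> float:
    """Closed-form minimum of t*(a*d + c/d) over d; note that t drops out."""
    return math.sqrt(c / a)


def grid_argmin(t: float, a: float = 0.01, c: float = 0.04) -> float:
    """Numerical check: search subinterval durations from 0.1 to 10.0."""
    grid = [0.1 * k for k in range(1, 101)]
    return min(grid, key=lambda d: total_variance(t, d, a, c))


for t in (10.0, 100.0):
    print(t, grid_argmin(t), optimal_subinterval())
```

Under these assumptions the minimizing subinterval is sqrt(c/a) for every t, and timing a 10-unit interval in optimal segments yields less variance than timing it as a single unit. Adding a counting-error term, which the abstract says governs the Weber constant at long intervals, would change the details.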
In a two-link, concurrent-chain schedule, pigeons' pecks on each key during the initial link occasionally produced a terminal link, during which only that key was operative. Responses in the terminal link were reinforced with food on either fixed-interval or variable-interval schedules. In one experiment, relative amount of responding in the initial link equaled the relative harmonic rate of reinforcement in the terminal links. In a second experiment, the selection of interreinforcement intervals in variable-interval schedules in the terminal links was such that rates of reinforcement based on the harmonic or on the arithmetic means of the interreinforcement intervals predicted opposite preferences in the initial links. The observed preference was consistent with that predicted by the harmonic rather than by the arithmetic rates of reinforcement.

When primary reinforcement is delivered on concurrent variable-interval schedules, differential changes in some dimensions of reinforcement, such as amount or delay, often produce proportional changes in the number of responses on either schedule (Catania, 1963; Chung and Herrnstein, 1967 [...] Although these studies showed that preference depends on the temporal distribution of reinforcements, there was no consensus as to how reinforcement frequency should be calculated in order to achieve matching. Autor and Herrnstein, who used variable-interval and variable-ratio schedules of primary reinforcement, measured frequency as the reciprocal of the arithmetic mean of the interreinforcement intervals. Fantino, who used fixed-ratio (FR) and mixed-ratio schedules of primary reinforcement, measured frequency as the reciprocal of the geometric mean of the interreinforcement intervals.
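The three candidate measures of reinforcement frequency mentioned above differ only in which mean of the interreinforcement intervals is inverted. The sketch below uses two hypothetical terminal-link schedules (the interval values are invented for illustration) chosen so that the arithmetic and harmonic rates predict opposite preferences.

```python
import math
from statistics import harmonic_mean, mean


def arithmetic_rate(intervals: list[float]) -> float:
    """Reciprocal of the arithmetic mean interreinforcement interval."""
    return 1.0 / mean(intervals)


def harmonic_rate(intervals: list[float]) -> float:
    """Reciprocal of the harmonic mean interval (equals the mean of 1/interval)."""
    return 1.0 / harmonic_mean(intervals)


def geometric_rate(intervals: list[float]) -> float:
    """Reciprocal of the geometric mean interval."""
    log_mean = sum(math.log(x) for x in intervals) / len(intervals)
    return 1.0 / math.exp(log_mean)


def relative_rate(rate_a: float, rate_b: float) -> float:
    """Predicted share of initial-link responding on schedule A under matching."""
    return rate_a / (rate_a + rate_b)


# Hypothetical schedules, intervals in seconds.
variable = [2.0, 58.0]   # variable-interval: one short and one long interval
fixed = [20.0, 20.0]     # fixed-interval 20 s

print("arithmetic:", relative_rate(arithmetic_rate(variable), arithmetic_rate(fixed)))
print("harmonic:  ", relative_rate(harmonic_rate(variable), harmonic_rate(fixed)))
print("geometric: ", relative_rate(geometric_rate(variable), geometric_rate(fixed)))
```

With these invented intervals the arithmetic rate favors the fixed schedule (relative rate below .5 for the variable one), while the harmonic rate favors the variable schedule, because short intervals dominate a mean of reciprocals.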
When Herrnstein (1964b) studied preference for variable-interval (VI) vs. fixed-interval (FI) schedules, he could find no simple transformation on the distribution of interreinforcement intervals that would cause preference to match relative frequency of reinforcement.

The problem of designating the correct measure of reinforcement frequency is a basic one. It entails first the decision of criteria for a "good" measure, and second, a technique for finding a transformation which most closely satisfies those criteria. Certainly a necessary criterion for any measure of reinforcement frequency, when this is assumed to be the controlling variable, is the following: whenever an organism is indifferent between different schedules of reinforcement, appropri-