Inference using significance testing and Bayes factors is compared and contrasted in five case studies based on real research. The first study illustrates that the methods will often agree, both in motivating researchers to conclude that H1 is supported better than H0, and the other way round, that H0 is better supported than H1. The next four, however, show that the methods will also often disagree. In these cases, the aim of the paper is to motivate the sensible evidential conclusion, and then see which approach matches those intuitions. Specifically, it is shown that a high-powered non-significant result is consistent with no evidence for H0 over H1 worth mentioning, which a Bayes factor can show, and, conversely, that a low-powered non-significant result is consistent with substantial evidence for H0 over H1, again indicated by Bayesian analyses. The fourth study illustrates that a high-powered significant result may not amount to any evidence for H1 over H0, matching the Bayesian conclusion. Finally, the fifth study illustrates that different theories can be evidentially supported to different degrees by the same data, a fact that P-values cannot reflect but Bayes factors can. It is argued that, where the two approaches disagree, appropriate conclusions match the Bayesian inferences and not those based on significance testing.
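As a rough illustration of how a Bayes factor can separate "evidence for H0" from "no evidence either way", the sketch below compares H0 (effect exactly zero) with an H1 that places a half-normal prior on the effect. Everything in it is an assumption made for illustration and is not taken from the abstract: the normal likelihood summarized by an estimate and its standard error, the half-normal prior, the names `bayes_factor_half_normal` and `describe`, and the conventional cut-offs of 3 and 1/3.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bayes_factor_half_normal(obs, se, h1_scale, steps=20000):
    """BF10 for H1 (half-normal prior on the effect) versus H0 (effect = 0).

    obs      : observed effect estimate (e.g. a mean difference)
    se       : standard error of that estimate
    h1_scale : SD of the half-normal prior, i.e. the rough effect size
               that the theory behind H1 predicts (an assumed input)
    """
    # Integrate likelihood * prior over theta >= 0 with the midpoint rule.
    upper = max(8.0 * h1_scale, obs + 8.0 * se)
    width = upper / steps
    marginal_h1 = 0.0
    for i in range(steps):
        theta = (i + 0.5) * width
        prior = 2.0 * normal_pdf(theta, 0.0, h1_scale)  # half-normal density
        marginal_h1 += normal_pdf(obs, theta, se) * prior * width
    # Under H0 the effect is fixed at zero, so no integration is needed.
    return marginal_h1 / normal_pdf(obs, 0.0, se)


def describe(bf):
    """Label a Bayes factor using the conventional 3 and 1/3 cut-offs."""
    if bf > 3.0:
        return "evidence for H1 over H0"
    if bf < 1.0 / 3.0:
        return "evidence for H0 over H1"
    return "data insensitive"


if __name__ == "__main__":
    # Two hypothetical null-ish results with the same estimate (zero) but
    # different precision: only the precise one counts as evidence for H0.
    for se in (1.0, 0.1):
        bf = bayes_factor_half_normal(obs=0.0, se=se, h1_scale=1.0)
        print(f"se={se}: BF10={bf:.3f} -> {describe(bf)}")
```

The two calls in the usage section use the same point estimate, so any difference in the verdict comes only from how precisely the effect was measured relative to the effect size H1 predicts.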
Researchers often conclude an effect is absent when a null-hypothesis significance test yields a non-significant p-value. However, it is neither logically nor statistically correct to conclude an effect is absent when a hypothesis test is not significant. We present two methods to evaluate the presence or absence of effects: Equivalence testing (based on frequentist statistics) and Bayes factors (based on Bayesian statistics). In four examples from the gerontology literature we illustrate different ways to specify alternative models that can be used to reject the presence of a meaningful or predicted effect in hypothesis tests. We provide detailed explanations of how to calculate, report, and interpret Bayes factors and equivalence tests. We also discuss how to design informative studies that can provide support for a null model or for the absence of a meaningful effect. The conceptual differences between Bayes factors and equivalence tests are discussed, and we also note when and why they might lead to similar or different inferences in practice. It is important that researchers are able to falsify predictions or can quantify the support for predicted null-effects. Bayes factors and equivalence tests provide useful statistical tools to improve inferences about null effects.
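The equivalence-testing idea mentioned above can be sketched as two one-sided tests (TOST) against a pair of equivalence bounds. This is a minimal sketch under stated assumptions, not the procedure from the article: it uses a normal (z) approximation instead of a t distribution, and the function name `tost_z`, its parameters, and the example numbers are all hypothetical.

```python
from statistics import NormalDist


def tost_z(estimate, se, low, high, alpha=0.05):
    """Two one-sided z-tests for equivalence.

    estimate  : observed effect (e.g. a mean difference)
    se        : standard error of the estimate
    low, high : equivalence bounds, i.e. the smallest effects of interest
    Returns (p_tost, equivalent), where p_tost is the larger of the two
    one-sided p-values and equivalent is True when p_tost < alpha.
    """
    if not low < high:
        raise ValueError("low must be smaller than high")
    std_normal = NormalDist()
    # Test 1: reject "effect <= low" when the estimate sits well above low.
    p_lower = 1.0 - std_normal.cdf((estimate - low) / se)
    # Test 2: reject "effect >= high" when the estimate sits well below high.
    p_upper = std_normal.cdf((estimate - high) / se)
    # Equivalence is claimed only if both one-sided tests reject.
    p_tost = max(p_lower, p_upper)
    return p_tost, p_tost < alpha


if __name__ == "__main__":
    # Hypothetical numbers: the same estimate tested against wide and narrow bounds.
    print(tost_z(estimate=0.0, se=1.0, low=-3.0, high=3.0))
    print(tost_z(estimate=0.0, se=1.0, low=-1.0, high=1.0))
```

With small samples a t-based version would be the more usual choice; the z form is used here only to keep the sketch dependent on the standard library alone.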
The rubber hand illusion is one reliable way to experimentally manipulate the experience of body ownership. However, debate continues about the necessary and sufficient conditions eliciting the illusion. We measured proprioceptive drift and the subjective experience (via questionnaire) while manipulating two variables that have been suggested to affect the intensity of the illusion. First, the rubber hand was positioned either in a posturally congruent position, or rotated by 180°. Second, either the anatomically same rubber hand was used, or an anatomically incongruent one. We found in two independent experiments that a rubber hand rotated by 180° leads to increased proprioceptive drift during synchronous visuo-tactile stroking, although it does not lead to feelings of ownership (as measured by questionnaire). This dissociation between drift and ownership suggests that proprioceptive drift is not necessarily a valid proxy for the illusion when using hands rotated by 180°.
The ability to respond to hypnotic suggestions (hypnotizability) is a stable trait which can be measured in a standardized procedure consisting of a hypnotic induction and a series of hypnotic suggestions. The SWASH is a 10-item adaptation of an established scale, the Waterloo-Stanford Group C Scale of Hypnotic Suggestibility (WSGC). Development of the SWASH was motivated by three distinct aims: to reduce required screening time, to provide an induction which more accurately reflects current theoretical understanding, and to supplement the objective scoring with experiential scoring. Screening time was reduced by shortening the induction, removing two suggestions which may cause distress (dream and age regression), and making modifications which allow administration in lecture theatres, so that more participants can be screened simultaneously. Theoretical issues were addressed by removing references to sleep, absorption, and eye fixation and closure. Data from 418 participants at the University of Sussex and Lancaster University are presented, along with data from 66 participants who completed a retest screening. The subjective and objective scales were highly correlated. The subjective scale showed good reliability and objective scale reliability was comparable to the WSGC. The addition of subjective scale responses to the post-hypnotic suggestion (PHS) item suggested a high probability that responses to PHS are inflated in WSGC screening. The SWASH is an effective measure of hypnotizability, which reflects changes in conscious experience and presents practical and theoretical advantages over existing scales.
The self-concept maintenance theory holds that many people will cheat in order to maximize self-profit, but only to the extent that they can do so while maintaining a positive self-concept. Mazar, Amir, and Ariely (2008, Experiment 1) gave participants an opportunity and incentive to cheat on a problem-solving task. Prior to that task, participants either recalled the Ten Commandments (a moral reminder) or recalled 10 books they had read in high school (a neutral task). Results were consistent with the self-concept maintenance theory. When given the opportunity to cheat, participants given the moral-reminder priming task reported solving 1.45 fewer matrices than did those given a neutral prime (Cohen's d = 0.48); moral reminders reduced cheating. Mazar et al.'s article is among the most cited in deception research, but their Experiment 1 has not been replicated directly. This Registered Replication Report describes the aggregated result of 25 direct replications (total N = 5,786), all of which followed the same preregistered protocol. In the primary meta-analysis (19 replications, total n = 4,674), participants who were given an opportunity