Lauren Skorb scite author profile

Many Labs 5: Testing Pre-Data-Collection Peer Review as an Intervention to Increase Replicability

Replication studies in psychological science sometimes fail to reproduce prior findings. If these studies use methods that are unfaithful to the original study or ineffective in eliciting the phenomenon of interest, then a failure to replicate may be a failure of the protocol rather than a challenge to the original finding. Formal pre-data-collection peer review by experts may address shortcomings and increase replicability rates. We selected 10 replication studies from the Reproducibility Project: Psychology (RP:P; Open Science Collaboration, 2015) for which the original authors had expressed concerns about the replication designs before data collection; only one of these studies had yielded a statistically significant effect (p < .05). Commenters suggested that lack of adherence to expert review and low-powered tests were the reasons that most of these RP:P studies failed to replicate the original effects. We revised the replication protocols and received formal peer review prior to conducting new replication studies. We administered the RP:P and revised protocols in multiple laboratories (median number of laboratories per original study = 6.5, range = 3–9; median total sample = 1,279.5, range = 276–3,512) for high-powered tests of each original finding with both protocols. Overall, following the preregistered analysis plan, we found that the revised protocols produced effect sizes similar to those of the RP:P protocols (Δr = .002 or .014, depending on analytic approach). The median effect size for the revised protocols (r = .05) was similar to that of the RP:P protocols (r = .04) and the original RP:P replications (r = .11), and smaller than that of the original studies (r = .37). Analysis of the cumulative evidence across the original studies and the corresponding three replication attempts provided very precise estimates of the 10 tested effects and indicated that their effect sizes (median r = .07, range = .00–.15) were 78% smaller, on average, than the original effect sizes (median r = .37, range = .19–.50).

The Meta-Science of Adult Statistical Word Segmentation: Part 1

Hartshorne

Skorb

Dietz

et al. 2019

We report the first set of results in a multi-year project to assess the robustness – and the factors promoting robustness – of the adult statistical word segmentation literature. This includes eight total experiments replicating six different experiments. The purpose of these replications is to assess the reproducibility of reported experiments, examine the replicability of their results, and provide more accurate effect size estimates. Reproducibility was mixed, with several papers either lacking crucial details or containing errors in the description of method, making it difficult to ascertain what was done. Replicability was also mixed: although in every instance we confirmed above-chance statistical word segmentation, many theoretically important moderations of that learning failed to replicate. Moreover, learning success was generally much lower than in the original studies. In the General Discussion, we consider whether these differences are due to differences in subject populations, low power in the original studies, or some combination of these and other factors. We also consider whether these findings are likely to generalize to the broader statistical word segmentation literature.

Many Labs 5: Replication of van Dijk, van Kleef, Steinel, and van Beest (2008)

Skorb

Aczél

Bakos

et al. 2020

Advances in Methods and Practices in Psychological Science

As part of the Many Labs 5 project, we ran a replication of van Dijk, van Kleef, Steinel, and van Beest’s (2008) study examining the effect of emotions in negotiations. They reported that when the consequences of rejection were low, subjects offered fewer chips to angry bargaining partners than to happy partners. We ran this replication under three protocols: the protocol used in the Reproducibility Project: Psychology, a revised protocol, and an online protocol. The effect averaged one ninth the size of the originally reported effect and was significant only for the revised protocol. However, the difference between the original and revised protocols was not significant.

Many Labs 5: Testing pre-data collection peer review as an intervention to increase replicability

Ebersole

Mathur

Baranski

et al. 2019

Preprint

Replication efforts in psychological science sometimes fail to replicate prior findings. If replications use methods that are unfaithful to the original study or ineffective in eliciting the phenomenon of interest, then a failure to replicate may be a failure of the replication protocol rather than a challenge to the original finding. Formal pre-data collection peer review by experts may address shortcomings and increase replicability rates. We selected 10 replications from the Reproducibility Project: Psychology (RP:P; Open Science Collaboration, 2015) in which the original authors had expressed concerns about the replication designs before data collection and only one of which was “statistically significant” (p < .05). Commenters on RP:P suggested that lack of adherence to expert review and low-powered tests were the reasons that most of these failed to replicate (Gilbert et al., 2016). We revised the replication protocols and received formal peer review prior to conducting new replications. We administered the RP:P and Revised replication protocols in multiple laboratories (Median number of laboratories per original study = XX; Range XX to YY; Median total sample = XX; Range XX to YY) for high-powered tests of each original finding with both protocols. Overall, XX of 10 RP:P protocols and XX of 10 Revised protocols showed significant evidence in the same direction as the original finding (p < .05), compared to an expected XX. The median effect size was [larger/smaller/similar] for Revised protocols (ES = .XX) compared to RP:P protocols (ES = .XX), and [larger/smaller/similar] compared to the original studies (ES = .XX) and [larger/smaller/similar] compared to the original RP:P replications (ES = .XX). Overall, Revised protocols produced [much larger/somewhat larger/similar] effect sizes compared to RP:P protocols (ES = .XX). We also elicited peer beliefs about the replications through prediction markets and surveys of a group of researchers in psychology. The peer researchers predicted that the Revised protocols would [decrease/not affect/increase] the replication rate, [consistent with/not consistent with] the observed replication results. The results suggest that the lack of replicability of these findings observed in RP:P was [partly/completely/not] due to discrepancies in the RP:P protocols that could be resolved with expert peer review.

Robustness of the adult statistical word segmentation literature: Part 1

Hartshorne

Skorb

Dietz

et al. 2017

Preprint

We report the first set of results in a multi-year project to replicate every adult statistical word segmentation study. We report replications of six experiments. The purpose of these replications is both to assess the strength of the findings in the statistical learning literature and to provide more accurate effect size estimates. In every instance, we were able to replicate successful learning. However, many theoretically important modulations of that learning failed to replicate. Moreover, learning success was generally much lower than in the original studies. In the General Discussion, we consider whether these differences are due to differences in subject populations, low power in the original studies, or some other factor. Regardless, these initial results suggest caution in relying on the originally reported findings.

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
