An experimental science relies on solid and replicable results. The last few years have seen a rich discussion on the reliability and validity of psychological science and whether our experimental findings can falsify our existing theoretical models. Yet, concerns have also arisen that this movement may impede new theoretical developments. In this article, we reanalyze the data from a crowdsourced replication project that concluded that lab site did not matter as a predictor of Stroop performance and, therefore, that there were no "hidden moderators" (i.e., context was likely to matter little in predicting the outcome of the Stroop task). We challenge this conclusion via a new analytical method, supervised machine learning, that "allows the data to speak." We apply this approach to the results from a Stroop task to illustrate the utility of machine learning and to propose moderators for future (confirmatory) testing. We discuss where our conclusions differ from those of the original article, which variables need to be controlled for in future inhibitory control tasks, and how psychologists can use machine learning to find surprising, yet solid, results in their own data.
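To make the title's method concrete, below is a minimal sketch of the kind of variable-importance screen described in the abstract. It is an illustration under stated assumptions, not the analysis reported in this article: scikit-learn does not provide conditional inference forests (the conditional random forest of the title, typically fit with R's party::cforest), so an ordinary random forest with permutation importance stands in for it, and the file name, predictor names, and outcome column are hypothetical placeholders.

```python
# Minimal sketch: screening candidate moderators of Stroop performance
# with a forest-based variable-importance measure. An ordinary random
# forest with permutation importance is used as a stand-in for a
# conditional inference forest; all column names are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical participant-level data set.
df = pd.read_csv("stroop_data.csv")                      # assumed file
predictors = ["lab_site", "age", "time_of_day",
              "native_language"]                         # hypothetical predictors
X = pd.get_dummies(df[predictors], drop_first=True)      # encode categorical columns
y = df["stroop_effect"]                                  # hypothetical outcome column

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

forest = RandomForestRegressor(n_estimators=500, random_state=42)
forest.fit(X_train, y_train)

# Permutation importance on held-out data: how much does shuffling each
# predictor degrade predictions? Large drops flag candidate moderators
# worth confirmatory (pre-registered) testing.
result = permutation_importance(
    forest, X_test, y_test, n_repeats=50, random_state=42
)
for name, importance in sorted(
    zip(X_test.columns, result.importances_mean), key=lambda pair: -pair[1]
):
    print(f"{name}: {importance:.4f}")
```

The point of such a screen is exploratory: the ranked importances suggest which contextual variables deserve confirmatory tests, rather than providing evidence of moderation by themselves.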
What Predicts Stroop Performance? A Conditional Random Forest Approach

An experimental science relies on solid and replicable results. The last few years have seen a rich discussion on the reliability and validity of psychological science and whether our experimental findings can falsify our existing theoretical models (Coyne, 2016). Arguably, one of the most important events calling into question the solidity of psychological science was the publication on precognition by Daryl Bem (2011), followed by a paper on data-contingent analyses by Simmons, Nelson, and Simonsohn (2011). Following these papers, the times, they have been a-changing: Extensive discussions have been held on power, replicability, pre-registration, and context sensitivity (Bakker, Hartgerink, Wicherts, & Van der Maas, in press; Brandt, IJzerman, et al., 2014; Van Bavel, Mende-Siedlecki, Brady, & Reinero, 2016; IJzerman et al., 2016). Though Bem's (2011) paper was perhaps the most prominent, problems seem to be widespread, and these problems are often blamed on the fact that novelty has been championed over solid and replicable findings (Coyne, 2016; Nosek, Spies, & Motyl, 2012).

Yet, this begs the question: How can we retain creative and innovative exploration in service of finding out "truths" about human functioning? Currently, the most popular way to do so is to conduct an experiment, explore the data, write a paper, replicate this study, and, if it fails, update one's theoretical assumptions (and perhaps argue on social media for a while). This approach is incredibly sensitive to post hoc interpretation and is inefficient. Indeed, there are near-infinite reasons why effects do or do not replicate, and failures to replicate often do little to inform theoretical predictions (but see Brandt, IJzerman, et al., 2014). A number of recent articles were published...