“…These assessments are usually made through quantitative model selection methods, which penalize models based on either their a priori flexibility (e.g., Kass & Raftery, 1995; Myung et al., 2006; Annis et al., 2019; Gronau et al., 2017; Schwarz, 1978) or their over-fitting to the noise in samples of data (e.g., Spiegelhalter et al., 2002; Vehtari et al., 2017; Browne, 2000; Akaike, 1974). Importantly, models that are more flexible a priori will have an unfair advantage over simpler models in accurately explaining the data (Roberts & Pashler, 2000; Myung & Pitt, 1997; Evans, Howard, et al., 2017), and models that over-fit to a sample of data will predict future data more poorly than those that capture only the robust trends (Myung, 2000). Although model comparison is less similar to confirmatory experimental research than model application is, it still typically involves confirmatory research questions about which models will be superior to others, making it well suited to preregistration.…”
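
As a rough illustration of how such penalties work, the sketch below computes two of the criteria cited above, AIC (Akaike, 1974) and BIC (Schwarz, 1978), for two models. The log-likelihoods, parameter counts, and sample size are hypothetical values chosen only for illustration; they are not taken from any of the cited studies.

```python
import math


def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * k - 2 * log_lik


def bic(log_lik: float, k: int, n: int) -> float:
    """Schwarz (Bayesian) information criterion: k ln(n) - 2 ln(L). Lower is better."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical values, assumed purely for illustration.
N_OBS = 50  # number of observations
models = {
    "simple": {"log_lik": -100.0, "k": 2},   # fewer free parameters, slightly worse fit
    "flexible": {"log_lik": -98.5, "k": 6},  # more free parameters, slightly better fit
}

# Score each model under both criteria.
scores = {
    name: {
        "aic": aic(m["log_lik"], m["k"]),
        "bic": bic(m["log_lik"], m["k"], N_OBS),
    }
    for name, m in models.items()
}

# The preferred model is the one with the lowest criterion value.
best_by_aic = min(scores, key=lambda name: scores[name]["aic"])
best_by_bic = min(scores, key=lambda name: scores[name]["bic"])
```

With these assumed numbers, the flexible model fits the sample slightly better, but its four extra parameters cost more than the gain in log-likelihood, so both criteria prefer the simple model.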