“…At the group level, classifier performance was tested by comparing the group-level distribution of classification accuracies to a chance-level distribution using Bayesian one-sample t tests. Bayesian statistics were used given their robustness with small-to-moderate sample sizes and nonnormal distributions (Moore, Reise, Depaoli, & Haviland, 2015) and because, with these analyses, the bias toward accepting or rejecting the null hypothesis does not change with sample size. Furthermore, Bayesian statistics assess evidence for a model under investigation in light of the data, whereas group-level classical t tests make population-level inferences; population-level inferences using a classical t test have been shown to be problematic when comparing classification accuracies against chance level (Allefeld, Görgen, & Haynes, 2016). A Bayes factor (BF10) greater than 3 was considered as providing moderate evidence in favor of above-chance-level classification accuracy, a BF10 greater than 10 as providing strong evidence, a BF10 greater than 30 as providing very strong evidence, and a BF10 greater than 100 as providing decisive evidence (Lee & Wagenmakers, 2013).…”
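
The evidence categories described in the passage can be sketched as a simple lookup. This is a minimal illustration, not the authors' code: the function name `evidence_label`, the `THRESHOLDS` table, and the fallback label for values at or below 3 are assumptions introduced here, since the passage does not say how such values were described. The strict "greater than" comparisons follow the wording of the passage.

```python
import math

# Cutoffs and labels as stated in the passage, checked from largest to smallest.
# The table name and layout are hypothetical, chosen for illustration only.
THRESHOLDS = [
    (100.0, "decisive"),
    (30.0, "very strong"),
    (10.0, "strong"),
    (3.0, "moderate"),
]


def evidence_label(bf10: float) -> str:
    """Map a Bayes factor (BF10) to the evidence category named in the passage."""
    if math.isnan(bf10) or bf10 < 0:
        raise ValueError("BF10 must be a non-negative number")
    for cutoff, label in THRESHOLDS:
        # Strict inequality: the passage says "greater than" each cutoff.
        if bf10 > cutoff:
            return label
    # Assumption: the passage gives no label for BF10 <= 3.
    return "below moderate threshold"


if __name__ == "__main__":
    for value in (1.2, 3.0, 4.5, 12.0, 45.0, 250.0):
        print(f"BF10 = {value:>6}: {evidence_label(value)}")
```

Because each comparison is strict, a BF10 of exactly 3 does not count as moderate evidence under this reading.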