2016
DOI: 10.1016/j.neuroimage.2016.07.040
Valid population inference for information-based imaging: From the second-level t-test to prevalence inference

Abstract: In multivariate pattern analysis of neuroimaging data, 'second-level' inference is often performed by entering classification accuracies into a t-test vs chance level across subjects. We argue that while the random-effects analysis implemented by the t-test does provide population inference if applied to activation differences, it fails to do so in the case of classification accuracy or other 'information-like' measures, because the true value of such measures can never be below chance level. This constraint c…
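The abstract's central point, that accuracy is bounded below by chance so a significant second-level t-test only rejects the global null "no information in any subject", can be illustrated with a small noise-free sketch (all numbers are hypothetical, not from the paper):

```python
import math
import statistics

# Hypothetical, noise-free illustration: 10 of 50 subjects carry information
# (true accuracy 0.75) and 40 sit exactly at chance (0.50). Because accuracy
# cannot fall below chance, the subjects without information cannot balance
# out the others, and the across-subject mean is pulled above 0.5.
accuracies = [0.75] * 10 + [0.50] * 40

mean_acc = statistics.mean(accuracies)  # 0.55
sd_acc = statistics.stdev(accuracies)
t_stat = (mean_acc - 0.5) / (sd_acc / math.sqrt(len(accuracies)))  # 3.5

# The t-test rejects, yet 80% of the population is at chance: the rejection
# only supports "some subjects carry information", not a population-typical
# effect; this is the motivation for prevalence inference.
```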

Cited by 158 publications (201 citation statements)
References 60 publications
“…Although testing against chance is common, one caveat when using t-statistics on k-fold classification accuracy is that this does not allow population-level inference, in fact producing fixed-effects rather than random-effects results (see ref. 19 for details). The implication is that one cannot formally draw population-level inferences based on such analyses, restricting conclusions to the sample that was tested.…”
Section: Methods
confidence: 99%
“…We also employed a permutation test (permuting the classification labels within each participant 1000 times; one-tailed) to check whether the mean accuracy was significantly greater than chance level. See reference 43 for advanced issues pertaining to population-level inferences in MVPA studies.…”
Section: Methods
confidence: 99%
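The within-participant label permutation described in this statement can be sketched as follows (a minimal illustration; function and variable names are my own, not from the cited study):

```python
import random

def perm_test_mean_accuracy(trials, n_perm=1000, seed=0):
    """One-tailed permutation test of mean accuracy against chance.

    trials: list of (true_labels, predicted_labels) pairs, one per
    participant. Labels are permuted within each participant, as in the
    quoted method; the p-value uses the standard +1 correction.
    """
    rng = random.Random(seed)

    def accuracy(y, yhat):
        return sum(a == b for a, b in zip(y, yhat)) / len(y)

    observed = sum(accuracy(y, yhat) for y, yhat in trials) / len(trials)

    n_extreme = 0
    for _ in range(n_perm):
        mean_acc = 0.0
        for y, yhat in trials:
            y_perm = list(y)
            rng.shuffle(y_perm)  # permute labels within this participant
            mean_acc += accuracy(y_perm, yhat)
        mean_acc /= len(trials)
        if mean_acc >= observed:
            n_extreme += 1
    return observed, (n_extreme + 1) / (n_perm + 1)
```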
“…At the group level, classifier performance was tested by comparing the group-level distribution of classification accuracies to a chance-level distribution using Bayesian one-sample t tests; Bayesian statistics were used given their robustness in case of small-to-moderate sample sizes and nonnormal distributions (Moore, Reise, Depaoli, & Haviland, 2015) and because, with these analyses, the bias toward accepting or rejecting the null hypothesis does not change with sample size. Furthermore, Bayesian statistics assess evidence for a model under investigation in the light of the data, whereas group-level classical t tests make population-level inferences; population-level inferences using a classical t test have been shown to be problematic when comparing classification accuracies against chance level (Allefeld, Görgen, & Haynes, 2016). A Bayes factor (BF10) greater than 3 was considered as providing moderate evidence in favor of above-chance classification accuracy, a BF10 greater than 10 as strong evidence, a BF10 greater than 30 as very strong evidence, and a BF10 greater than 100 as decisive evidence (Lee & Wagenmakers, 2013).…”
Section: Multivariate Analyses
confidence: 99%
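The evidence categories used in this statement (thresholds of 3, 10, 30, and 100, after Lee & Wagenmakers, 2013) amount to a simple mapping; a sketch, with my own function name and labels:

```python
def bf10_evidence(bf10: float) -> str:
    """Evidence label for a Bayes factor BF10, using the thresholds
    from the quoted study (3, 10, 30, 100; Lee & Wagenmakers, 2013)."""
    if bf10 > 100:
        return "decisive"
    if bf10 > 30:
        return "very strong"
    if bf10 > 10:
        return "strong"
    if bf10 > 3:
        return "moderate"
    return "anecdotal or none"
```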
“…Significance of searchlight classifications was also assessed at both the individual and group levels, given that analyses of group-level classification accuracies can flag small above-chance classifications as significant even when, at the individual level, only a few (if any) participants show significant classification accuracies. It is therefore important to also consider the prevalence of the effect across participants and not only the mean classification rate of the group (Allefeld et al., 2016). To obtain significance values for individual-level classification accuracies, we used binomial tests indicating the classification-accuracy threshold at which voxels are significant at p < .05 according to a binomial distribution (Noirhomme et al., 2014).…”
Section: Multivariate Analyses
confidence: 99%
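The binomial accuracy threshold mentioned in the last statement can be computed exactly from the binomial upper tail; a self-contained sketch (function name is my own, not from the cited studies):

```python
from math import comb

def binomial_accuracy_threshold(n_trials: int, chance: float = 0.5,
                                alpha: float = 0.05) -> int:
    """Smallest number of correct trials k with P(X >= k) < alpha for
    X ~ Binomial(n_trials, chance), i.e. the per-voxel accuracy threshold
    for a one-sided binomial test against chance."""
    def upper_tail(k):  # exact P(X >= k)
        return sum(comb(n_trials, i) * chance ** i
                   * (1.0 - chance) ** (n_trials - i)
                   for i in range(k, n_trials + 1))
    for k in range(n_trials + 1):
        if upper_tail(k) < alpha:
            return k
    raise ValueError("no attainable threshold at this alpha")
```

For 100 two-class trials at chance level 0.5, for example, this gives 59 correct trials (59% accuracy) as the one-sided p < .05 threshold.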