The analysis of functional mapping experiments in positron emission tomography involves the formation of images displaying the values of a suitable statistic, summarising the evidence in the data for a particular effect at each voxel. These statistic images must then be scrutinised to locate regions showing statistically significant effects. The methods most commonly used are parametric, assuming a particular form of probability distribution for the voxel values in the statistic image. Scientific hypotheses, formulated in terms of parameters describing these distributions, are then tested on the basis of the assumptions. Images of statistics are usually considered as lattice representations of continuous random fields. These are more amenable to statistical analysis. There are various shortcomings associated with these methods of analysis. The many assumptions and approximations involved may not be true. The low numbers of subjects and scans, in typical experiments, lead to noisy statistic images with low degrees of freedom, which are not well approximated by continuous random fields. Thus, the methods are only approximately valid at best and are most suspect in single-subject studies. In contrast to the existing methods, we present a nonparametric approach to significance testing for statistic images from activation studies. Formal assumptions are replaced by a computationally expensive approach. In a simple rest-activation study, if there is really no activation effect, the labelling of the scans as "active" or "rest" is artificial, and a statistic image formed with some other labelling is as likely as the observed one. Thus, considering all possible relabellings, a p value can be computed for any suitable statistic describing the statistic image. Consideration of the maximal statistic leads to a simple nonparametric single-threshold test. 
This randomisation test relies only on minimal assumptions about the design of the experiment, is (almost) exact, with Type I error (almost) exactly that specified, and hence is always valid. The absence of distributional assumptions permits the consideration of a wide range of test statistics, for instance, "pseudo" t statistic images formed with smoothed variance images. The approach presented extends easily to other paradigms, permitting nonparametric analysis of most functional mapping experiments. When the assumptions of the parametric methods are true, these new nonparametric methods, at worst, provide for their validation. When the assumptions of the parametric methods are dubious, the nonparametric methods provide the only analysis that can be guaranteed valid and exact.
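The relabelling argument above can be illustrated with a minimal sketch. It assumes a single-subject design in which every scan is an independent row of a hypothetical `scans` array (scans by voxels), uses an ordinary pooled-variance two-sample t statistic at each voxel, and enumerates every way of choosing which scans are labelled "active". The function names and the array layout are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations

import numpy as np


def t_image(scans, active_idx):
    """Pooled-variance two-sample t statistic (active minus rest) at every voxel."""
    mask = np.zeros(scans.shape[0], dtype=bool)
    mask[list(active_idx)] = True
    active, rest = scans[mask], scans[~mask]
    na, nr = len(active), len(rest)
    pooled = ((na - 1) * active.var(axis=0, ddof=1)
              + (nr - 1) * rest.var(axis=0, ddof=1)) / (na + nr - 2)
    return (active.mean(axis=0) - rest.mean(axis=0)) / np.sqrt(pooled * (1 / na + 1 / nr))


def max_stat_test(scans, active_idx):
    """Single-threshold randomisation test based on the maximal statistic.

    Returns the observed statistic image, one adjusted p value per voxel,
    and the randomisation distribution of the image maximum.
    """
    n_scans = scans.shape[0]
    observed = t_image(scans, active_idx)
    # Maximum of the statistic image under every possible relabelling
    # (the observed labelling is one of them, so no p value can be zero).
    max_dist = np.array([
        t_image(scans, idx).max()
        for idx in combinations(range(n_scans), len(active_idx))
    ])
    # Adjusted p value: share of relabellings whose image maximum is at
    # least as large as the observed value at this voxel.
    p_adj = (max_dist[None, :] >= observed[:, None]).mean(axis=1)
    return observed, p_adj, max_dist


# Toy data: 8 scans, 50 voxels, a large effect planted at voxel 0.
rng = np.random.default_rng(0)
scans = rng.standard_normal((8, 50))
scans[:4, 0] += 50.0
observed, p_adj, max_dist = max_stat_test(scans, active_idx=(0, 1, 2, 3))
print(len(max_dist), p_adj[0])  # C(8, 4) = 70 relabellings are enumerated
```

Full enumeration is only feasible for small designs; with more scans, a random subset of relabellings would normally be drawn instead, which is where the "(almost) exact" qualification becomes relevant.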
The Type I and II error properties of the t test were evaluated by means of a Monte Carlo study that sampled 8 real distribution shapes identified by Micceri (1986, 1989) as being representative of types encountered in psychology and education research. Results showed the independent-samples t tests to be reasonably robust to Type I error when (a) sample sizes are equal, (b) sample sizes are fairly large, and (c) tests are two-tailed rather than one-tailed. Nonrobust results were obtained primarily under distributions with extreme skew. The t test was robust to Type II error under these nonnormal distributions, but researchers should not overlook robust nonparametric competitors that are often more powerful than the t test when its underlying assumptions are violated.
Along with Pearson's chi-squared test, the independent-samples t test must be counted among the best-known statistical procedures in current use. Given its familiarity and utility, it is not surprising that over the years, this test has received an inordinate amount of attention from statistical researchers. Much of this attention has focused on the question of robustness (or lack thereof) of the t statistic to departures from the underlying assumption of population normality.
Although there is some disagreement on the subject (see Bradley, 1978), the prevailing view seems to be that the independent-samples t test is reasonably robust, insofar as Type I errors are concerned, to non-Gaussian population shape so long as (a) sample sizes are equal or nearly so, (b) sample sizes are fairly large (Boneau, 1960, mentions sample sizes of 25 to 30), and (c) tests are two-tailed rather than one-tailed. Note also that when these conditions are met and differences between nominal alpha and actual alpha do occur, discrepancies are usually of a conservative rather than of a liberal nature.
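A Monte Carlo check of this kind can be sketched as follows. The sketch draws both samples from the same population, so every rejection is a Type I error, and reports the empirical rejection rates for two-tailed and one-tailed tests. The real distributions from Micceri's survey are not reproduced here; an exponential population is used as an assumed stand-in for a skewed shape, and `type1_rates` and the sampler functions are hypothetical names.

```python
import numpy as np
from scipy import stats


def type1_rates(sampler, n1, n2, alpha=0.05, reps=20000, seed=0):
    """Empirical Type I error of the pooled-variance independent-samples t test.

    Both groups come from the same population, so the null hypothesis holds.
    Returns the (two-tailed, upper-tail, lower-tail) rejection rates.
    """
    rng = np.random.default_rng(seed)
    a = sampler(rng, (reps, n1))
    b = sampler(rng, (reps, n2))
    t, _ = stats.ttest_ind(a, b, axis=1)  # equal variances assumed by default
    df = n1 + n2 - 2
    two_tailed = np.mean(np.abs(t) > stats.t.ppf(1 - alpha / 2, df))
    upper = np.mean(t > stats.t.ppf(1 - alpha, df))
    lower = np.mean(t < stats.t.ppf(alpha, df))
    return two_tailed, upper, lower


def normal(rng, size):
    return rng.standard_normal(size)


def skewed(rng, size):
    # Assumed stand-in for an extremely skewed population.
    return rng.exponential(1.0, size)


for name, sampler in [("normal", normal), ("exponential", skewed)]:
    for n1, n2 in [(30, 30), (5, 15)]:
        print(name, (n1, n2), type1_rates(sampler, n1, n2))
```

Comparing each rate with the nominal alpha across equal and unequal sample sizes, and across tails, mirrors conditions (a) to (c) above.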