Thirion. An empirical comparison of surface-based and volume-based group studies in neuroimaging. NeuroImage, Elsevier, 2012, 63 (3)

Reliably detecting functional activity in a population of subjects is crucial in human brain mapping, both for understanding cognitive functions in normal subjects and for analyzing patient data. The usual approach normalizes brain volumes to a common three-dimensional template. However, a large part of the data acquired in fMRI aims at localizing cortical activity, and methods working on the cortical surface may provide better inter-subject registration than the standard procedures that process the data in the volume. Nevertheless, few assessments of the performance of surface-based (2D) versus volume-based (3D) procedures have been reported so far, mostly because inter-subject cortical surface maps are not easily obtained. In this paper we present a systematic comparison of 2D versus 3D group-level inference procedures, using cluster-level and voxel-level statistics assessed by permutation, in random-effects (RFX) and mixed-effects (MFX) analyses. We consider different schemes to perform meaningful comparisons between thresholded statistical maps in the volume and on the cortical surface. We find that surface-based multi-subject statistical analyses are generally more sensitive than their volume-based counterparts, in the sense that they detect slightly denser networks of regions when performing peak-level detection; this effect is less clear for cluster-level inference and is reduced by smoothing. Surface-based inference also increases the reliability of the activation maps.
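
To make the notion of statistics "assessed by permutation" concrete, the following is a minimal sketch of one common scheme: a random-effects one-sample t-test whose location-wise (voxel or surface node) significance is calibrated by random sign flips of the subjects' effect maps, using the maximum statistic to control the family-wise error. This is an illustration under stated assumptions, not the procedure used in the paper: the function names `one_sample_t` and `sign_flip_max_t`, the synthetic data, and all parameter values are hypothetical, and cluster-level and mixed-effects variants are not covered.

```python
import numpy as np


def one_sample_t(data):
    """One-sample t statistic per location; data is (n_subjects, n_locations)."""
    n_subjects = data.shape[0]
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)
    return mean / (std / np.sqrt(n_subjects))


def sign_flip_max_t(data, n_perm=1000, seed=0):
    """Corrected p-values from the null distribution of the maximum t.

    Under the null hypothesis of a symmetric, zero-mean effect, flipping the
    sign of a subject's whole map leaves the joint distribution unchanged.
    """
    rng = np.random.default_rng(seed)
    observed = one_sample_t(data)
    max_null = np.empty(n_perm)
    for i in range(n_perm):
        # One random sign per subject, applied to all locations at once
        signs = rng.choice([-1.0, 1.0], size=(data.shape[0], 1))
        max_null[i] = one_sample_t(data * signs).max()
    # Count permutations whose maximum reaches each observed statistic;
    # the +1 terms count the unpermuted data as one draw from the null.
    exceed = (max_null[None, :] >= observed[:, None]).sum(axis=1)
    p_corrected = (1 + exceed) / (n_perm + 1)
    return observed, p_corrected


# Synthetic example (hypothetical values): 20 subjects, 50 locations,
# with a true effect in the first 5 locations only.
rng = np.random.default_rng(42)
data = rng.standard_normal((20, 50))
data[:, :5] += 2.0
t_obs, p_corr = sign_flip_max_t(data)
detected = np.flatnonzero(p_corr < 0.05)
print("locations detected at corrected p < 0.05:", detected)
```

The same function applies unchanged to volume-based and surface-based data, since it only assumes a subjects-by-locations array; what differs between the two settings is how the locations are defined and put in correspondence across subjects.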