Research in face recognition has continuously been challenged by extrinsic (head pose, lighting conditions) and intrinsic (facial expression, aging) sources of variability. While many survey papers on face recognition exist, this paper focuses on a comparative study of 3-D face recognition under expression variations. As a first contribution, we list 3-D face databases with expressions, briefly present the most important ones, and quantify their complexity using the iterative closest point (ICP) baseline recognition algorithm. This allows the databases to be ranked according to their inherent difficulty for face-recognition tasks. The analysis reveals that the FRGC v2 database can be considered the most challenging because of its size, the presence of expressions and outliers, and the time lapse between the recordings. Therefore, we recommend using this database as a reference database to evaluate (expression-invariant) 3-D face-recognition algorithms. We also determine and quantify the most important factors that influence performance. It appears that performance decreases 1) with the degree of nonfrontal pose, 2) for certain expression types, 3) with the magnitude of the expressions, 4) with an increasing number of expressions, and 5) for a higher number of gallery subjects. Future 3-D face-recognition algorithms should be evaluated on the basis of all these factors. As the second contribution, a survey of published 3-D face-recognition methods that deal with expression variations is given. These methods are subdivided into three classes depending on the way the expressions are handled. Region-based methods use expression-stable regions only, while the other methods model the expressions using either an isometric or a statistical model. Isometric models assume the deformation caused by expression variation to be (locally) isometric, meaning that the deformation preserves lengths along the surface.
Statistical models learn how the facial soft tissue deforms during expressions based on a training database with expression labels. Algorithmic performance is evaluated by comparing recognition rates for identification and verification. No statistically significant differences in performance are found between any pair of classes.
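
As an illustration of the two evaluation scenarios mentioned above, the following minimal sketch computes a rank-1 identification rate and a verification rate at a fixed false-accept rate from a probe-by-gallery dissimilarity matrix (for example, residual ICP distances). The function names, the toy matrix, and the thresholding convention are assumptions made for illustration; they are not taken from the surveyed methods or from any specific evaluation protocol.

```python
from typing import List, Sequence


def rank1_identification_rate(dist: Sequence[Sequence[float]],
                              probe_ids: Sequence[str],
                              gallery_ids: Sequence[str]) -> float:
    """Fraction of probes whose closest gallery entry has the same identity."""
    hits = 0
    for row, pid in zip(dist, probe_ids):
        # Index of the gallery entry with the smallest dissimilarity.
        best = min(range(len(row)), key=row.__getitem__)
        if gallery_ids[best] == pid:
            hits += 1
    return hits / len(probe_ids)


def verification_rate_at_far(dist: Sequence[Sequence[float]],
                             probe_ids: Sequence[str],
                             gallery_ids: Sequence[str],
                             far: float) -> float:
    """Fraction of genuine pairs accepted when at most `far` of the
    impostor pairs are accepted (assumed convention: accept a pair when
    its dissimilarity is strictly below the threshold)."""
    genuine: List[float] = []
    impostor: List[float] = []
    for row, pid in zip(dist, probe_ids):
        for d, gid in zip(row, gallery_ids):
            (genuine if gid == pid else impostor).append(d)
    impostor.sort()
    k = int(far * len(impostor))  # impostor pairs allowed to pass
    # Strictly-below comparison keeps the accepted impostors at k or fewer.
    threshold = impostor[k] if k < len(impostor) else float("inf")
    return sum(d < threshold for d in genuine) / len(genuine)


# Toy example: three gallery subjects, one probe per subject.
gallery_ids = ["a", "b", "c"]
probe_ids = ["a", "b", "c"]
dist = [
    [0.1, 0.9, 0.8],
    [0.7, 0.2, 0.6],
    [0.3, 0.5, 0.4],
]

print(rank1_identification_rate(dist, probe_ids, gallery_ids))
print(verification_rate_at_far(dist, probe_ids, gallery_ids, far=0.0))
```

In this toy matrix the third probe is closer to subject "a" than to its own gallery entry, so it counts as an identification error; in verification, the same pair is rejected only once the threshold is tight enough to exclude all impostor pairs.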