Computer-generated (CG) face images are common in video games, advertisements, and other media. CG faces vary in their degree of realism, a factor that affects viewer reactions, so efficient control of the visual realism of face images is important. Such control depends on a deep understanding of visual realism perception: the extent to which viewers judge an image to be a real photograph rather than a CG image. Across two experiments, we explored the processes involved in visual realism perception of face images. In Experiment 1, participants made visual realism judgments on original face images, inverted face images, and face images in which the top and bottom halves were misaligned. In Experiment 2, participants made visual realism judgments on original face images, scrambled faces, and images showing different parts of faces. Our findings indicate that both holistic and piecemeal processing are involved in visual realism perception of faces, with holistic processing becoming more dominant at lower resolution. Our results also suggest that shading information is more important than color for holistic processing, and that inversion makes visual realism judgments harder for realistic images but not for unrealistic images. Furthermore, we found that the eyes are the most influential face part for visual realism, and that face context is critical for evaluating the realism of face parts. To the best of our knowledge, this is the first realism-centric study that attempts to connect human perception of visual realism in face images with general face perception tasks.