Developmental prosopagnosia (DP) is a cognitive condition characterized by a relatively selective impairment in face recognition. Currently, people are screened for DP via a single attempt at objective face-processing tests, usually all presented on the same day. However, several variables probably influence performance on these tests irrespective of actual ability, and the influence of repeat administration is also unknown. Here, we assess, for the first time to our knowledge, the test–retest reliability of the Cambridge Face Memory Test (CFMT), the leading task used worldwide to diagnose DP. The reliability estimate fell just below psychometric standards, and single-case analyses revealed further inconsistencies in performance that were driven neither by testing location (online or in-person) nor by the time elapsed between attempts. Later administration of an alternative version of the CFMT (the CFMT-Aus) was also found to be valuable in confirming borderline cases. Finally, we found that performance on the first 48 trials of the CFMT was as sensitive as the full 72-item score, suggesting that the instrument may be shortened for testing efficiency. We consider the implications of these findings for existing diagnostic protocols, concluding that two independent tasks of unfamiliar face memory should be completed on separate days.