Artificial intelligence can now synthesise face images that people cannot distinguish from real faces. Here, we investigated the wisdom of the (outer) crowd (averaging different individuals' responses to the same trial) and of the inner crowd (averaging the same individual's two responses to the same trial after completing the test twice) as routes to increased performance. In Experiment 1, participants viewed synthetic and real faces, and rated whether they thought each face was synthetic or real using a 1–7 scale. Each participant completed the task twice. Inner crowds showed little benefit over individual responses, and we found no associations between performance and personality factors. However, performance increased with the size of the outer crowd. In Experiment 2, participants judged each face only once, providing a binary ‘synthetic/real’ response, along with a confidence rating and an estimate of the percentage of other participants that they thought agreed with their answer. We compared three methods of aggregation for outer crowd decisions, finding that the majority vote provided the best performance for small crowds. However, the ‘surprisingly popular’ solution outperformed both the majority vote and the confidence‐weighted approach for larger crowds. Taken together, these results demonstrate that outer crowds are a robust route to improved synthetic face detection, with gains comparable to those of previous approaches based on training interventions.
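
To make the three aggregation rules from Experiment 2 concrete, the following is a minimal illustrative sketch for a single face, not the analysis code used in the study. The function names, the encoding of responses as booleans (True meaning 'synthetic'), the scaling of confidence and agreement estimates to the 0–1 range, and the tie-breaking towards 'real' are all assumptions made for illustration. The 'surprisingly popular' rule is sketched in its usual binary form: choose the answer whose actual share of votes exceeds the share the crowd predicted for it.

```python
from statistics import mean


def majority_vote(votes):
    """Return True ('synthetic') if more than half of the crowd voted synthetic.

    Ties resolve to False ('real'); this tie-break is an arbitrary assumption.
    """
    return sum(votes) > len(votes) / 2


def confidence_weighted(votes, confidences):
    """Sum each vote weighted by its confidence (assumed scaled to 0-1).

    'Synthetic' votes count positively, 'real' votes negatively.
    """
    score = sum(c if v else -c for v, c in zip(votes, confidences))
    return score > 0


def surprisingly_popular(votes, predicted_agreement):
    """Pick the answer that is more popular than the crowd predicted.

    predicted_agreement[i] is participant i's estimate (0-1) of the proportion
    of others who agree with their own answer, so a 'real' voter's estimate p
    implies a predicted 'synthetic' share of 1 - p.
    """
    actual_synthetic = mean(1.0 if v else 0.0 for v in votes)
    predicted_synthetic = mean(
        p if v else 1.0 - p for v, p in zip(votes, predicted_agreement)
    )
    return actual_synthetic > predicted_synthetic


# Hypothetical responses from a crowd of five to one face (not real data).
votes = [True, False, False, False, True]
confidences = [0.9, 0.5, 0.5, 0.6, 0.9]
predicted_agreement = [0.3, 0.8, 0.9, 0.8, 0.2]

decisions = {
    "majority": majority_vote(votes),
    "confidence_weighted": confidence_weighted(votes, confidences),
    "surprisingly_popular": surprisingly_popular(votes, predicted_agreement),
}
```

In this made-up example the three rules can disagree: a minority answer held with high confidence, or chosen more often than the crowd expected, can override the simple majority.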