In recent years, visual facial forgery has reached a level of sophistication that humans cannot identify fraud, which poses a significant threat to information security. A wide range of malicious applications have emerged, such as deepfake, fake news, defamation or blackmailing of celebrities, impersonation of politicians in political warfare, and the spreading of rumours to attract views. As a result, a rich body of visual forensic techniques has been proposed in an attempt to stop this dangerous trend. However, there is no comprehensive, fair, and unified performance evaluation to enlighten the community on best performing methods. The authors present a systematic benchmark beyond traditional surveys that provides in‐depth insights into facial forgery and facial forensics, grounding on robustness tests such as contrast, brightness, noise, resolution, missing information, and compression. The authors also provide a practical guideline of the benchmarking results, to determine the characteristics of the methods that serve as a comparative reference in this never‐ending war between measures and countermeasures. The authors’ source code is open to the public.