Enhancing roadway safety is a priority in transportation. Hence, Artificial Intelligence (AI)-powered crash anticipation, which aims to assist drivers and Automated Driving Systems (ADSs) in avoiding crashes, is receiving growing attention. To gain the trust of ADS users, it is essential to benchmark the performance of AI models against that of humans. This paper establishes a gaze data-based method, with measures and metrics, for evaluating human drivers' ability to anticipate crashes. A laboratory experiment is designed and performed in which a screen-based eye tracker collects the gaze data of six volunteers while they watch 100 videos that include both normal and risky driving scenes. Statistical analyses (at the 0.05 significance level) of the experimental data show that, on average, drivers can anticipate a crash up to 2.61±0.100 seconds before it occurs. The probability that drivers anticipate a crash before it occurs, termed humans' recall, is 0.928±0.0206. An AI model achieving the same recall value can anticipate crashes 2.22 seconds earlier than drivers on average, with an anticipation precision of 0.959. The study finds that crash-involved traffic agents in driving scenes can change drivers' instant attention level, average attention level, and spatial attention distribution. This finding supports AI models that learn a dynamic spatial-temporal attention mechanism to strengthen their ability to anticipate crashes. Results from the comparison suggest benefits of keeping humans in the loop, including further refining AI