Speakers adjust their pronunciation to sound more similar to recently heard speech, a phenomenon called phonetic imitation. The extent to which speakers imitate is commonly measured using the AXB perception task, which relies on the judgements of listeners. Despite the task's popularity, very few studies using the AXB assessment have considered variation or reliability in the listeners’ performance. The current study applies a test-retest methodology to the performance of listeners in the AXB assessment of imitation, an aspect that has not been considered explicitly before. Forty listeners completed the same AXB experiment twice, two to three weeks apart. The findings show that both sessions reach the same overall conclusion: the listeners perceived the same overall amount of imitation in both sessions, which is taken to mean that the shadowers did imitate and that the AXB task is reliable at the group level. Furthermore, the findings show that listeners vary substantially in their performance in the AXB assessment of imitation, but that each listener's performance is relatively consistent across sessions. This suggests that differences in AXB performance at least partly reflect differences in the ability to perceive imitation, rather than simply random variation.