Many researchers use Wizard of Oz (WoZ) as an experimental technique, but there are methodological concerns over its use and no comprehensive criteria for how best to employ it. We systematically reviewed 54 WoZ experiments published in the primary HRI publication venues from 2001 to 2011. Using criteria proposed by Fraser and Gilbert (1991), Green et al. (2004), Steinfeld et al. (2009), and Kelley (1984), we analyzed how researchers conducted HRI WoZ experiments. Researchers mainly used WoZ for verbal (72.2%) and non-verbal (48.1%) processing. Most researchers constrained wizard production (90.7%), but few constrained wizard recognition (11%). Few reported measuring wizard error (3.7%), and few reported pre-experiment wizard training (5.4%). Only a minority reported using WoZ in an iterative manner (24.1%). Based on these results, we propose new reporting guidelines to aid future research.