Zentall (2008) challenges Arantes and Grace's (2008) failure to replicate Clement, Feltus, Kaiser, and Zentall (2000)

Learning & Behavior 2008, 36 (1), 26-28
doi: 10.3758/LB.36.1.26

J. Arantes, joana.arantes@canterbury.ac.nz; R. C. Grace, randolph.grace@canterbury.ac.nz

Arantes and Grace (2008) report two experiments that attempted to (1) replicate the work ethic effect reported by Clement, Feltus, Kaiser, and Zentall (2000); (2) determine whether the results were similar for simultaneous and successive discrimination training; and (3) analyze the effects of amount of training on the preference for the high-effort stimulus. However, both experiments failed to replicate Clement et al.'s results. Instead of a consistent preference for the high-effort stimuli, we found that for both S+ and S− test trials, preference for the FR20 stimulus depended on the initiating event that preceded the choice stimuli. Specifically, pigeons preferred the FR20 stimulus on trials preceded by FR1 or by no response requirement and preferred the FR1 stimulus on trials preceded by FR20. We also found that preferences on test trials became more extreme as amount of training increased and were generally stronger on S than on S trials.

In his commentary, Zentall (2008) argues that our failures to replicate Clement et al. (2000) should not be taken as evidence that the work ethic effect is unreliable. He suggests that our results may be due to methodological shortcomings such as insufficient training or subjects' experimental histories, and that they are actually consistent with those of Clement et al. when examined closely. However, we do not find these arguments convincing.

We are puzzled by Zentall's comment that we trained the pigeons to criterion and then tested them in Experiment 1, and consequently failed to replicate the work ethic effect because of the "minimal amount of training provided" (p. 20). This is incorrect; our pigeons received more training than did those of Clement et al.
(2000). In Experiment 1, our pigeons completed an average of 37.2 baseline sessions prior to test. Given that the average number of sessions to reach criterion in the simultaneous condition was 2.1, our pigeons had an average of 35.1 sessions of overtraining prior to test, as compared with Clement et al.'s, which had a total of 23.0 sessions (3.0 sessions to reach criterion plus 20.0 sessions of overtraining). Overall, the training our pigeons received was comparable to the amount that Singer, Berry, and Zentall (2007) found was necessary for a reliable within-contrast effect to emerge. Thus, it is unlikely that our failure to replicate Clement et al. in Experiment 1 was due to insufficient training.

Regarding our Experiment 2, Zentall claims that the preference for the stimuli that followed the greater response requirement in baseline trials increased as more training was provided. He calculates the overall average preferences in the S and S trials for the subgroup of 4 pigeons that received extended training as 56% and 65%, respectively, concludin...