The weighted statistics (WEST) approach recommended by Howard, Best, and Nickels (2014) provides a simple, statistically rigorous method for measuring the size of a treatment effect that can be used for comparisons across conditions (e.g., trained vs. untrained items), time (e.g., improvement during baseline testing vs. pre- to post-treatment improvement), participants, or intervention methods. The approach analyses the change in per cent correct from pre- to post-intervention for each item used in the study, subtracting pre-intervention from post-intervention per cent correct and using item-to-item variability to assess statistical significance. The recommendations made by Howard and colleagues will undoubtedly strengthen the inferences that can be drawn from single-case intervention studies. However, I have some concerns with the statistical approach they suggest. While using difference scores is a convenient way to measure the size of an effect, alternative measures may better capture the goals of an intervention study. The choice of effect size can have serious consequences for the statistical inferences drawn from a study, particularly when the central question relies on the interpretation of an interaction. Furthermore, there are serious statistical problems with comparing differences in per cent correct using parametric statistics like t-tests and analyses of variance (ANOVAs), as suggested by the WEST approach (e.g., Jaeger, 2008). I believe that consideration of these statistical issues will further strengthen single-case design intervention studies.
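To make the core of the difference-score analysis concrete, the following is a minimal sketch in Python. The item names and accuracy values are invented for illustration, and the sketch omits the weighting schemes that give WEST its name, showing only the basic logic the passage describes: compute each item's change in per cent correct from pre- to post-intervention, then use the item-to-item variability of those difference scores to test whether the mean change differs from zero.

```python
import numpy as np
from scipy import stats

# Hypothetical per-item accuracy (per cent correct across probes),
# pre- and post-intervention; all values are invented for illustration.
pre = np.array([20, 40, 0, 60, 20, 40, 20, 0, 40, 60], dtype=float)
post = np.array([60, 80, 40, 80, 60, 80, 40, 20, 80, 100], dtype=float)

# Per-item difference scores: change in per cent correct for each item.
diff = post - pre

# Item-to-item variability of the difference scores supplies the error
# term: a one-sample t-test of the differences against zero.
t, p = stats.ttest_1samp(diff, 0.0)
print(f"mean change = {diff.mean():.1f} points, t = {t:.2f}, p = {p:.4f}")
```

The same difference scores could instead be compared across conditions (e.g., trained vs. untrained items) with an independent-samples test, which is where the interaction questions discussed below arise.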
Different measures of effect size can lead to different conclusions

Consider an intervention study in which two participants are given the same intervention and assessed pre- and post-intervention. Participant A improves from 30% to 50% accuracy and Participant B from 70% to 90%. According to the WEST approach the participants benefitted equally from the treatment, but, because the two participants started at different points at baseline, it is not clear that the 20% improvement benefits both participants equally. Participant A's improvement may have a larger effect on that participant's ability to communicate independently than Participant B's. Alternatively, the psychological notion of a learning curve assumes initially rapid progress that slows as the learner approaches maximum performance, making Participant B's improvement more impressive than Participant A's. Intuitively it is clear that two participants who show the same change in accuracy have not necessarily benefitted from the intervention equally. Furthermore, different ways of thinking about the effectiveness of an intervention can lead to different conclusions about who benefited more from the treatment. The appropriate *