This paper addresses the practical use of accuracy index values based on the squared difference between participant scores and true scores, the D² index. It clarifies an ambiguity in the literature regarding the use of these index values to evaluate the scoring accuracy of human raters (evaluators). The paper critically investigates the effect of frame-of-reference (FOR) training on the accuracy of third-party evaluators' scores for organisations, such as those going through the Malcolm Baldrige National Quality Award (MBNQA) self-assessment exercise. It discusses a case study with 90 individual participants. Their scores were recorded before training (no training) and after FOR training. The study showed that FOR training improved the elevation accuracy index (p < 0.05) in five of the seven categories used in this exercise. An observed leniency effect was also reduced. However, no improvement in the DA was observed. Thus, the evaluators' ability to assign an accurate overall score improved, while their ability to discriminate between relative strengths and weaknesses did not. This implies that evaluator training, particularly for heterogeneous pools of volunteers like those of corporate, state, and local quality awards, should include more content on the performance dimensions.
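As a rough illustration of the kind of index discussed here, a squared-difference accuracy measure and its elevation component can be written as below. The notation is an assumption made for illustration and is not taken from the paper: $x_{ij}$ is an evaluator's score for item $i$ in category $j$, $t_{ij}$ is the corresponding true score, and averaging over $n$ items and $k$ categories is only one possible convention.

$$
D^2 = \frac{1}{nk}\sum_{i=1}^{n}\sum_{j=1}^{k}\left(x_{ij}-t_{ij}\right)^2,
\qquad
E^2 = \left(\bar{x}_{..}-\bar{t}_{..}\right)^2
$$

Under this sketch, $E^2$ measures how far the evaluator's overall mean score $\bar{x}_{..}$ lies from the true overall mean $\bar{t}_{..}$, so a systematic tendency to score too high (leniency) would appear as a larger elevation value.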