Many metrics exist for the evaluation of binary classifiers, each with its particular advantages and shortcomings. Recently, an "Efficiency Index" (EI) has been proposed for the evaluation of such classifiers, based on the consistency (matching) and contradiction (mismatching) of outcomes. This metric and its confidence intervals are easy to calculate from the base data of a 2×2 contingency table, and its values can be categorised both qualitatively and semi-quantitatively. For medical tests, the context in which the EI was originally proposed, it facilitates the communication of risk (of correct diagnosis versus misdiagnosis) to both clinicians and patients. Variants of the EI (balanced, unbiased) that take account of disease prevalence and test cut-offs have also been described. The objectives of the current paper were, first, to extend the EI construct to further formulations (balanced level, quality) and, second, to explore the utility of the EI and all four of its variants when applied to the dataset of a large prospective test accuracy study of a cognitive screening instrument. This application showed that the balanced level, quality, and unbiased formulations of the EI are more stringent measures than the original EI.
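As a minimal illustration of the claim that the EI is easy to calculate from a 2×2 contingency table, the sketch below assumes the originally proposed definition EI = (TP + TN) / (FP + FN), i.e. the ratio of matching to mismatching outcomes. The percentile bootstrap is used here as one generic way to obtain a confidence interval and is not necessarily the interval method used by the authors; the cell counts are hypothetical.

```python
import random

def efficiency_index(tp: int, fp: int, fn: int, tn: int) -> float:
    """EI = matching outcomes (TP + TN) / mismatching outcomes (FP + FN)."""
    return (tp + tn) / (fp + fn)

def bootstrap_ci(tp, fp, fn, tn, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for EI, resampling the four 2x2 cell counts."""
    rng = random.Random(seed)
    n = tp + fp + fn + tn
    stats = []
    for _ in range(n_boot):
        # Redraw n subjects with cell probabilities estimated from the table.
        draws = rng.choices(range(4), weights=(tp, fp, fn, tn), k=n)
        c = [draws.count(i) for i in range(4)]
        if c[1] + c[2] > 0:  # skip degenerate resamples with no mismatches
            stats.append((c[0] + c[3]) / (c[1] + c[2]))
    stats.sort()
    return (stats[int(alpha / 2 * len(stats))],
            stats[int((1 - alpha / 2) * len(stats)) - 1])

# Hypothetical 2x2 table: test outcome vs. reference-standard diagnosis.
tp, fp, fn, tn = 80, 20, 10, 90
lo, hi = bootstrap_ci(tp, fp, fn, tn)
print(f"EI = {efficiency_index(tp, fp, fn, tn):.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

On this hypothetical table the point estimate is EI = 170/30 ≈ 5.7, i.e. correct classifications outnumber misclassifications roughly sixfold; the balanced, unbiased, balanced level, and quality variants discussed in the paper would adjust this figure for prevalence and test cut-offs.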