Internationally, standard observational measures of Early Childhood Education and Care (ECEC) are used to assess the quality of provision. They are applied as research tools but, significantly, also guide policy decisions, distribution of resources and public opinion. Considerable faith is placed in such measures, yet their validity, reliability and functioning within context should all be considered when interpreting the findings they generate. We examine the case of the Classroom Assessment Scoring System (CLASS) in the Australian study, Effective Early Education for Children (E4Kids). On this measure, Australian educators were identified as “low quality” in provision of instruction (average 2.1 on a scale of 1–7). When these results became public, they attracted negative press coverage and carried the potential for harm. We interrogate these findings by asking three questions relating to sampling, contextual and empirical evidence that define quality and measurement strategies. We conclude that measurement problems, most notably a floor effect, are the most likely explanation for uniformly low CLASS-Instructional scores among Australian ECEC educators, and indeed across international studies. Using a theoretically and empirically informed rescaling strategy, we show that there is a diversity of instructional quality across Australian ECEC, and that rescaling might more effectively guide improvement strategies to target provision of the lowest quality. More broadly, our findings call for a more critical approach to the interpretation of standard measures of ECEC quality and their applications in policy and practice internationally.