Assessment for learning (AfL) is a major approach to educational assessment that relies heavily on pedagogical practices, such as involving students in assessment, making objectives and criteria transparent, and asking open-ended questions that provoke higher-order thinking. In this perspective piece, I argue that unless classroom activities can be opened to systematic and rigorous inspection and evaluation, AfL fails to be assessment. AfL activities happen ephemerally in classrooms, leading to in-the-moment and on-the-fly interpretations and decisions about student learning. In these contexts, the degree of error in those judgements is not determined. Because human performance is so variable and because the samples teachers use to make judgements are not robustly representative, there is considerable error in their judgements about student learning. Nonetheless, despite the difficulties seen in putting them into practice, AfL activities appear to be good classroom teaching practices. In contrast, assessment proper requires careful inspection of data so that alternative explanations can be evaluated, leading to a preference for the most valid and reliable interpretation of performance evidence. Psychometric methods not only quantify amounts or qualities of performance, but also evaluate the degree to which judges agree with each other, leading to confidence in the validity and reliability of insights. Consequently, because AfL activities lack the essential characteristics of assessment, namely attention to error and methods of minimising its impact on interpretations, I recommend we stop thinking of AfL as assessment, and instead position it as good teaching.