The international rise of large-scale evaluations of quality in education in recent decades has been accompanied, ironically, by a major neglect of the issue of quality itself. This is mainly because of fundamental shortcomings in the design of most large-scale evaluation programmes. In programmes like PISA, TALIS and SABER-Teachers, the central but intricate questions of quality become defined as questions of indexed quantity, thus deflecting the former to the margins, or out of the picture. Beginning with a review of the points just mentioned, this contribution then proceeds to identify and investigate some key inadequacies in the conceptions of evidence employed by large-scale evaluation programmes. Deficiencies in their gathering of evidence are likewise examined. A comparative perspective is also included to reveal overlooked exclusions in the supposedly neutral PISA instruments. The contribution then investigates the manifold character of an adequate research exploration of quality in educational experience. Finally, the case is made for advancing the kind of evaluation programme that includes an adequate understanding of quality in education and that does justice to this in its research design.