The purpose of this study was to serve as a stepping stone toward closer agreement among judges of student writing at the point of admission to college by revealing common causes of disagreement. It was expected and found that more than half the variability in grades of a large number of judges on the same set of papers was due to “error” (random variation) or the idiosyncratic preferences of individual readers. In the variability that was not random or idiosyncratic, it was expected and found that there was a substantial core of common agreement on the general merit of the papers and that a small number of “schools of thought” would account for most of the systematic differences in grading standards. Factor analysis of correlations among the grades of 53 distinguished readers, representing six different fields, on 300 papers written by college freshmen of widely varying ability revealed just five such “schools of thought,” emphasizing:
Ideas: relevance, clarity, quantity, development, persuasiveness;
Form: organization and analysis;
Flavor: style, interest, sincerity;
Mechanics: specific errors in grammar, punctuation, etc.;
Wording: choice and arrangement of words.
No standards or criteria for judging the papers were suggested to the readers. Instead, they were told to use “whatever hunches, intuitions, or preferences you normally use in deciding that one paper is better than another.” They sorted the papers into nine piles in order of “general merit.” The only restriction was that all nine piles must be used, with not less than 4% of the papers in any pile.
The five reader‐factors or “schools of thought” were identified by a “blind” classification of 11,018 comments written on 3,557 papers that were graded high (7–8–9) or low (1–2–3) by the three highest and three lowest readers on each factor. The person who classified the comments did not know the standing of any reader on any factor. In addition to the reader‐factors, three College Board tests taken by these students formed a separate “test‐factor” that had practically zero correlations with all reader‐factors except Mechanics (.50) and Wording (.45).
It was not the purpose of this study to achieve a high degree of unanimity among the readers but to reveal the differences of opinion that prevail in uncontrolled grading, both in the academic community and in the educated public. To that end, the readers included college English teachers, social scientists, natural scientists, writers and editors, lawyers, and business executives. Nonetheless, it was disturbing to find that 94% of the papers received either seven, eight, or nine of the nine possible grades; that no paper received fewer than five different grades; and that the median correlation between readers was .31. Readers in each field, however, agreed slightly better with the English teachers than with one another.
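
The two agreement statistics reported above (the median correlation between readers and the number of different grades each paper received) can be illustrated with a minimal sketch. It assumes that each reader's grades are stored as a list in the same paper order and that the correlation is the ordinary Pearson coefficient; the function names and the toy data are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation between two equal-length lists of grades."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def median_reader_correlation(grades):
    """Median of the correlations over every pair of readers.

    `grades` maps a reader label to that reader's grades,
    one grade per paper, all in the same paper order.
    """
    pairs = combinations(grades.values(), 2)
    return median(pearson(a, b) for a, b in pairs)


def distinct_grades_per_paper(grades):
    """For each paper, count how many different grades it received."""
    per_paper = zip(*grades.values())  # regroup reader lists by paper
    return [len(set(paper)) for paper in per_paper]


# Hypothetical toy data (NOT from the study): 3 readers, 5 papers,
# grades on the nine-point scale used for the nine piles.
toy_grades = {
    "reader_a": [1, 3, 5, 7, 9],
    "reader_b": [2, 3, 6, 6, 9],
    "reader_c": [9, 7, 5, 3, 1],
}

if __name__ == "__main__":
    print(median_reader_correlation(toy_grades))
    print(distinct_grades_per_paper(toy_grades))
```

With 53 readers the same function would take the median over 1,378 reader pairs, which is why a single summary figure is reported rather than the full correlation table.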