“…Second, if action were required, what sort of system qualification would then be both justifiable, in terms of cost effectiveness, and practically available? As noted above, few available studies report levels of agreement between teachers or expert raters on open-ended items. Interestingly, the consensus estimates for scoring reliability in both the Swedish and Norwegian national reading tests end up close to .73 (cf. Tengberg & Skar, 2016), although both item construction and the structure of scoring guidelines differ substantially between the two tests.…”