Preprint (2021)
DOI: 10.31234/osf.io/pkjth
Reflections on Analytical Choices in the Scaling Model for Test Scores in International Large-Scale Assessment Studies

Abstract: International large-scale assessments (LSAs) such as the Programme for International Student Assessment (PISA) provide important information about the distribution of student proficiencies across a wide range of countries. The repeated assessments of these content domains offer policymakers important information for evaluating educational reforms and receive considerable attention from the media. Furthermore, the analytical strategies employed in LSAs often define methodological standards for applied research…

Cited by 12 publications (36 citation statements); references 94 publications.
“…We believe that the call for controlling for test-taking behavior in the reporting in large-scale assessment studies such as response propensity [3] using models that also include response times [87,88] poses a threat to validity because results can be simply manipulated by instructing students to omit items they do not know [20]. Notably, missing item responses are mostly omissions for CR items.…”
Section: Discussion
confidence: 99%
“…In the literature, it is frequently argued that missing item responses should never be scored as incorrect [3,7,11,27]. However, we think that the arguments against the incorrect scoring are flawed, and simulation studies cannot show the inadequacy of the UW model (see [19][20][21]).…”
Section: Scoring Missing Item Responses as Wrong
confidence: 99%
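To make the contrast in this quotation concrete, here is a minimal sketch (not the authors' code; the toy response matrix is invented for illustration) comparing the two treatments of omitted responses at the level of proportion-correct scores rather than a full IRT scaling: scoring omissions as incorrect versus ignoring them among the observed responses.

```python
# Minimal sketch: two treatments of omitted item responses.
# Illustrative data only; not taken from the preprint.
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_items = 5, 8

# 1 = correct, 0 = incorrect, np.nan = omitted response
responses = rng.integers(0, 2, size=(n_persons, n_items)).astype(float)
responses[rng.random((n_persons, n_items)) < 0.2] = np.nan  # inject omissions

# Treatment A: score missing responses as incorrect.
scored_wrong = np.nan_to_num(responses, nan=0.0)
p_correct_wrong = scored_wrong.mean(axis=1)

# Treatment B: ignore missing responses (proportion correct among observed).
p_correct_ignored = np.nanmean(responses, axis=1)

for person in range(n_persons):
    print(f"person {person}: scored-as-wrong = {p_correct_wrong[person]:.2f}, "
          f"ignored = {p_correct_ignored[person]:.2f}")
```

Under Treatment B, omitting an item the student would have answered incorrectly raises the score, while under Treatment A it leaves the score unchanged; this asymmetry is the manipulation concern raised in the quotation above.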
“…Probably in the largest part of the literature, DIF effects are considered as fixed (e.g., Kopf et al 2015b). In this case, the condition for balanced DIF replaces the expected value by the mean associated with the fixed item parameters (Robitzsch and Lüdtke 2021a). There is no additional uncertainty introduced in the estimation of group differences with fixed DIF effects because the item parameters are held fixed in repeated sampling.…”
Section: Differential Item Functioning
confidence: 99%
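A short formal sketch of the distinction drawn in this quotation, in notation of my own choosing (the symbols e_i and I are not taken from the preprint): with random DIF effects the balanced-DIF condition is stated as an expectation, whereas with fixed item parameters it is replaced by the mean over the I items.

```latex
% Random-DIF formulation: item-specific DIF effects e_i are random draws,
% and balanced DIF is a zero-expectation condition.
\[ \operatorname{E}(e_i) = 0 \quad \text{(random DIF effects)} \]
% Fixed-DIF formulation: the e_i are fixed parameters of the I items,
% and the expectation is replaced by their mean.
\[ \frac{1}{I} \sum_{i=1}^{I} e_i = 0 \quad \text{(fixed DIF effects)} \]
```

Because fixed DIF effects do not vary over repeated sampling, they contribute no additional uncertainty to the estimated group difference, which is the point made in the quotation.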