On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis

Conijn, Judith M.; Emons, Wilco H. M.; Assen, Marcel A.L.M. van; Sijtsma, Klaas

doi:10.1080/00273171.2010.546733

Cited by 7 publications

(5 citation statements)

References 43 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…In a further alternative hybrid approach of design-based inference and model-based inference (see Ståhl et al, 2016), subjects can additionally be weighted by including weights ν P,n according to their fit to a statistical model. For example, model-based student-specific weights ν P,n can be derived according to their fit to the scaling model (person fit; see Conijn et al, 2011;Hong & Cheng, 2019;Raiche et al, 2012;Schuster & Yuan, 2011). In such an approach, students whose item responses are atypical with respect to the IRT model (e.g., non-scalable students; see Haertel, 1989) would be downweighted compared to students whose item responses are consistent with the IRT model.…”

Section: Model-assisted Design-based Inference For Personsmentioning

confidence: 99%

“…Moreover, the ability variable θ could also be redefined in a scaling model in which item responses and response times load on θ, resulting in a purified latent variable for speed (Costa et al, 2021). Furthermore, measurement models could also involve an additional student latent variable α n that characterizes person fit (Conijn et al, 2011;Ferrando, 2019;Raiche et al, 2012):…”

Section: The Role Of Test-taking Behavior In the Scaling Modelmentioning

confidence: 99%

See 1 more Smart Citation

Some thoughts on analytical choices in the scaling model for test scores in international large-scale assessment studies

Robitzsch

Lüdtke

2022

Meas Instrum Soc Sci

View full text Add to dashboard Cite

International large-scale assessments (LSAs), such as the Programme for International Student Assessment (PISA), provide essential information about the distribution of student proficiencies across a wide range of countries. The repeated assessments of the distributions of these cognitive domains offer policymakers important information for evaluating educational reforms and received considerable attention from the media. Furthermore, the analytical strategies employed in LSAs often define methodological standards for applied researchers in the field. Hence, it is vital to critically reflect on the conceptual foundations of analytical choices in LSA studies. This article discusses the methodological challenges in selecting and specifying the scaling model used to obtain proficiency estimates from the individual student responses in LSA studies. We distinguish design-based inference from model-based inference. It is argued that for the official reporting of LSA results, design-based inference should be preferred because it allows for a clear definition of the target of inference (e.g., country mean achievement) and is less sensitive to specific modeling assumptions. More specifically, we discuss five analytical choices in the specification of the scaling model: (1) specification of the functional form of item response functions, (2) the treatment of local dependencies and multidimensionality, (3) the consideration of test-taking behavior for estimating student ability, and the role of country differential items functioning (DIF) for (4) cross-country comparisons and (5) trend estimation. This article’s primary goal is to stimulate discussion about recently implemented changes and suggested refinements of the scaling models in LSA studies.

show abstract

Section: Model-assisted Design-based Inference For Personsmentioning

confidence: 99%

Section: The Role Of Test-taking Behavior In the Scaling Modelmentioning

confidence: 99%

Some thoughts on analytical choices in the scaling model for test scores in international large-scale assessment studies

Robitzsch

Lüdtke

2022

Meas Instrum Soc Sci

View full text Add to dashboard Cite

show abstract

“…Moreover, the ability variable θ could also be redefined in a scaling model in which item responses and response times load on θ, resulting in a purified latent variable for speed (Costa et al, 2021). Furthermore, measurement models could also involve an additional student latent variable α that characterizes person fit (Conijn et al, 2011;Ferrando, 2019;Raiche et al, 2012)…”

Section: The Role Of Test-taking Behavior In the Scaling Modelmentioning

confidence: 99%

Reflections on Analytical Choices in the Scaling Model for Test Scores in International Large-Scale Assessment Studies

Robitzsch¹,

Lüdtke²

2021

Preprint

View full text Add to dashboard Cite

International large-scale assessments (LSAs) such as the Programme for International Student Assessment (PISA) provide important information about the distribution of student proficiencies across a wide range of countries. The repeated assessments of these content domains offer policymakers important information for evaluating educational reforms and received considerable attention from the media. Furthermore, the analytical strategies employed in LSAs often define methodological standards for applied researchers in the field. Hence, it is vital to critically reflect the conceptual foundations of analytical choices in LSA studies. This article discusses methodological challenges in selecting and specifying the scaling model used to obtain proficiency estimates from the individual student responses in LSA studies. We distinguish design-based inference from model-based inference. It is argued that for the official reporting of LSA results, design-based inference should be preferred because it allows for a clear definition of the target of inference (e.g., country mean achievement) and is less sensitive to specific modeling assumptions. More specifically, we discuss five analytical choices in the specification of the scaling model: (1) Specification of the functional form of item response functions, (2) the treatment of local dependencies and multidimensionality, (3) the consideration of test-taking behavior for estimating student ability, and the role of country differential items functioning (DIF) for (4) cross-country comparisons, and (5) trend estimation. This article's primary goal is to stimulate discussion about recently implemented changes and suggested refinements of the scaling models in LSA studies.

show abstract

“…Several tests of person fit are aimed at the detection of trait changes between subsets of items; see Klauer and Rettig (1990) and Glas and Dagohoy (2007) for example. The multilevel logistic regression approach of Reise (2000) also evaluates the constancy of θ over the test, but see Conijn, Emons, van Assen, and Sijtsma (2011) for a critical review of this approach.…”

Section: The Information Matrix Testmentioning

confidence: 99%

Assessing Person Fit With the Information Matrix Test

Ranger

Kuhn

2015

Methodology

View full text Add to dashboard Cite

In this manuscript, a new approach to the analysis of person fit is presented that is based on the information matrix test of White (1982) . This test can be interpreted as a test of trait stability during the measurement situation. The test follows approximately a χ2-distribution. In small samples, the approximation can be improved by a higher-order expansion. The performance of the test is explored in a simulation study. This simulation study suggests that the test adheres to the nominal Type-I error rate well, although it tends to be conservative in very short scales. The power of the test is compared to the power of four alternative tests of person fit. This comparison corroborates that the power of the information matrix test is similar to the power of the alternative tests. Advantages and areas of application of the information matrix test are discussed.

show abstract

On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis

Cited by 7 publications

References 43 publications

Some thoughts on analytical choices in the scaling model for test scores in international large-scale assessment studies

Some thoughts on analytical choices in the scaling model for test scores in international large-scale assessment studies

Reflections on Analytical Choices in the Scaling Model for Test Scores in International Large-Scale Assessment Studies

Assessing Person Fit With the Information Matrix Test

Contact Info

Product

Resources

About