This article examines the interdependency of two context effects that are known to occur regularly in large-scale assessments: item position effects and effects of test-taking effort on the probability of correctly answering an item. A microlongitudinal design was used to measure test-taking effort over the course of a 60-minute large-scale assessment. Two components of test-taking effort were investigated: initial effort and change in effort. Both components significantly affected the probability of solving an item. In addition, participants' current test-taking effort diminished considerably over the course of the test. Furthermore, a substantial linear position effect was found, indicating that item difficulty increased during the test. This position effect varied considerably across persons. Concerning the interplay of position effects and test-taking effort, only the change in effort moderated the position effect, and persons differed with respect to this moderation effect. The consequences of these results for the reliability and validity of large-scale assessments are discussed.
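A hedged formalization of the kind of model this design implies (the abstract does not give the exact specification, and all symbols below are illustrative rather than the authors' notation): the log-odds of person p solving item i presented at position k combine person ability, item difficulty, a person-specific linear position effect, both effort components, and a person-specific moderation of the position effect by change in effort.

```latex
\operatorname{logit} P(X_{pik}=1) =
    \theta_p - \beta_i
  + (\delta_0 + \delta_p)\,k                        % linear position effect, varying across persons
  + \gamma_1 E_p                                    % initial effort
  + \gamma_2 \Delta E_{pk}                          % change in effort
  + (\gamma_3 + \gamma_{3p})\,\Delta E_{pk}\,k      % person-specific moderation of the position effect
```

Under this sketch, the reported findings would correspond to a negative fixed position effect (items become harder at later positions), nonzero variance of the person-specific position slopes, nonzero effort coefficients, and a moderation term for change in effort that itself varies across persons, while an analogous initial-effort-by-position interaction is absent.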
Item position effects can seriously bias analyses in educational measurement, especially when multiple matrix sampling designs are deployed. In such designs, item position effects may easily occur if not explicitly controlled for. Still, in practice it usually turns out to be rather difficult, or even impossible, to completely control for effects due to the position of items. The objectives of this article are to show how item position effects can be modeled using the linear logistic test model with an additional error term (LLTM + e) in the framework of generalized linear mixed models (GLMMs), to explore in a simulation study how well the LLTM + e holds the nominal Type I risk threshold, to conduct a power analysis for this model, and to examine the sensitivity of the LLTM + e to designs that are not completely balanced with respect to item position. Overall, the LLTM + e proved suitable for modeling item position effects when a balanced design is used. With decreasing balance, the model tends to become more conservative in the sense that true item position effects are less likely to be detected. Implications for linking and equating procedures that use common items are discussed.

Keywords: item position effects, generalized linear mixed models, linear logistic test model, balanced incomplete block design

In educational assessment, and in large-scale assessments in particular, item response theory (IRT) models are often applied to estimate the difficulty of items as well as the ability of examinees. Linking and equating procedures are used to define common scales, which allow achievement scores of two nonequivalent samples to be compared even if the two samples share only a subset of common items. The idea is to use a subset of so-called anchor items that are common to both samples to define a common scale. The corresponding design is known as the common-item nonequivalent groups equating design.
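As a hedged sketch in generic notation (not tied to any particular booklet design), the LLTM + e decomposes the difficulty of item i into a weighted sum of known item properties plus an item-specific error term, with a position effect entering as one such property:

```latex
\operatorname{logit} P(X_{pi}=1)
  = \theta_p - \Bigl(\sum_{k=1}^{K} q_{ik}\,\beta_k + \varepsilon_i\Bigr),
\qquad
\theta_p \sim N(0,\sigma_\theta^2), \quad
\varepsilon_i \sim N(0,\sigma_\varepsilon^2)
```

Here the q_{ik} are known covariates from the design matrix (e.g., the position at which item i appears, which in matrix sampling designs may vary by booklet and thus be person-by-item specific) and the \beta_k are the corresponding fixed effects. Dropping \varepsilon_i recovers the ordinary LLTM, which assumes the item properties explain item difficulty perfectly; the error term relaxes this typically unrealistic assumption.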
Background: In order to measure the proficiency of person populations in various domains, large-scale assessments often use marginal maximum likelihood IRT models in which person proficiency is modeled as a random variable. Thus, the model does not provide proficiency estimates for any single person. A popular approach to deriving such proficiency estimates is the multiple imputation of plausible values (PV), which enables subsequent analyses on complete data sets. The main drawback is that all variables that are to be analyzed later have to be included in the imputation model, so that the distribution of plausible values is conditional on these variables. These background variables (e.g., sex, age) have to be fully observed, which is highly unlikely in practice. In several current large-scale assessment programs, missing observations on background variables are dummy coded, and the dummy codes are then additionally used in the PV imputation model. However, this approach is only appropriate for small proportions of missing data; otherwise, the resulting population scores may be biased. Methods: Alternatively, single imputation or multiple imputation methods can be used to account for missing values on background variables. With both imputation methods, the result is a two-step procedure in which the PV imputation is nested within the background variable imputation. In single + multiple imputation (SMI), each missing value on the background variables is replaced by one imputed value. In multiple + multiple imputation (MMI), each missing value is replaced by a set of imputed values. MMI is expected to outperform SMI, as SMI ignores the uncertainty due to missing values in the background data. Results: In a simulation study, both methods yielded unbiased population estimates under most conditions. Still, the recovery proportion was slightly higher for the MMI method. Conclusions: The advantages of the MMI method are apparent for fairly high proportions of missing values in combination with fairly high dependency between the latent trait and the probability of missing data on the background variables.
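A minimal Python sketch of the nested two-step logic (the helper functions are hypothetical placeholders; an actual implementation would fit a latent regression IRT model for the PV step): the PV draws are nested within each completed background data set, so MMI yields m_bg × n_pv plausible-value sets per person, and SMI is the special case m_bg = 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def impute_background(data, rng):
    """Draw one completed copy of the background data (placeholder:
    fills each missing entry with a random draw from that variable's
    observed values; real programs would use, e.g., chained equations)."""
    completed = data.copy()
    for j in range(completed.shape[1]):
        col = completed[:, j]
        miss = np.isnan(col)
        if miss.any():
            col[miss] = rng.choice(col[~miss], size=miss.sum())
    return completed

def draw_plausible_values(completed_bg, n_pv, rng):
    """Placeholder PV step: in practice, fit a marginal IRT model with
    a latent regression on the background variables and draw each
    person's proficiency from their individual posterior."""
    weights = rng.normal(size=completed_bg.shape[1])
    cond_mean = completed_bg @ weights     # PVs are conditional on background data
    return [cond_mean + rng.normal(size=len(cond_mean)) for _ in range(n_pv)]

def mmi_plausible_values(background, m_bg, n_pv, rng):
    """MMI: nest the PV imputation within each of m_bg completed
    background data sets (SMI is the special case m_bg = 1)."""
    pv_sets = []
    for _ in range(m_bg):
        completed = impute_background(background, rng)
        pv_sets.extend(draw_plausible_values(completed, n_pv, rng))
    return pv_sets                         # m_bg * n_pv plausible-value vectors

# Toy data: 100 persons, 3 background variables, roughly 20% missing
background = rng.normal(size=(100, 3))
background[rng.random(background.shape) < 0.2] = np.nan
pvs = mmi_plausible_values(background, m_bg=5, n_pv=5, rng=rng)
print(len(pvs))  # 25 plausible-value vectors
```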
This study examines whether teachers can be classified, on the basis of the professional development courses they have attended, into groups that reflect distinct thematic foci of professional development. It further investigates whether teachers in these groups differ systematically. The study draws on a teacher survey conducted in 2009 as part of the IQB-Ländervergleich Sprachen (the IQB national assessment study for languages), in which 2,076 teachers of German and English participated. Five groups were identified that differ in the thematic foci of their professional development and in their level of participation. Particularly noteworthy among these groups were those with the highest and the lowest participation in professional development. Teachers who had attended courses across all thematic areas reported higher self-efficacy and cooperated more strongly with colleagues in their subject departments. In contrast, teachers without any professional development participation showed the reciprocal pattern.