Recently, Shealy and Stout (1993) proposed a DIF detection procedure, SIBTEST, which (1) is IRT-model-based, (2) is nonparametric, (3) does not require IRF estimation, (4) provides a test of significance, and (5) estimates the amount of DIF. Current versions of SIBTEST can be used only for dichotomously scored items; in this paper an extension to handle polytomous items is developed. The paper presents (1) a discussion of an appropriate definition of DIF for polytomously scored items, (2) a modified SIBTEST procedure for detecting DIF in polytomous items, and (3) the results of two simulation studies comparing the modified SIBTEST with the Mantel and standardized mean difference (SMD) procedures: one study with data constrained by a Rasch-like partial credit model (the same discrimination across polytomous items), and the other with data having distinctly different discriminations across items. These simulation studies indicate that the practice of including the studied item in the matching subtest to control impact-induced (i.e., arising from group ability differences) Type I error tends to yield highly unacceptable inflation of Type I and Type II error rates when the equal-discrimination condition is violated. The studies provide compelling evidence that the modified SIBTEST procedure is much more robust than the other procedures with regard to controlling impact-induced Type I error rate inflation.
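To make the general form of such matched-group DIF statistics concrete, the following is a minimal Python sketch of the uncorrected weighted mean-difference index that underlies SIBTEST-like and SMD-like procedures for a polytomous studied item. It is illustrative only: the function name and variables are hypothetical, and the regression correction that SIBTEST applies to the matched subgroup means is omitted here.

```python
import numpy as np

def weighted_mean_difference_dif(y_ref, x_ref, y_foc, x_foc):
    """Sketch of a SIBTEST-style DIF index for one polytomous item.

    y_ref, y_foc: numpy arrays of studied-item scores for the
        reference and focal groups.
    x_ref, x_foc: numpy arrays of matching-subtest scores for the
        same examinees.

    At each matching level k, the mean studied-item score of the
    reference group is compared with that of the focal group; the
    differences are pooled with weights proportional to the share of
    focal-group examinees at level k. Note this omits SIBTEST's
    regression correction of the subgroup means.
    """
    beta_hat = 0.0   # estimated amount of DIF
    var_hat = 0.0    # estimated variance of beta_hat
    n_foc = len(x_foc)
    for k in np.intersect1d(x_ref, x_foc):  # common matching levels
        r = y_ref[x_ref == k]
        f = y_foc[x_foc == k]
        if len(r) < 2 or len(f) < 2:
            continue  # skip sparsely populated matching levels
        p_k = len(f) / n_foc  # focal-group weight at level k
        beta_hat += p_k * (r.mean() - f.mean())
        var_hat += p_k**2 * (r.var(ddof=1) / len(r)
                             + f.var(ddof=1) / len(f))
    z = beta_hat / np.sqrt(var_hat)  # approximate z statistic for DIF
    return beta_hat, z
```

Under this setup, a z value beyond the standard normal critical value would flag the studied item for DIF, and beta_hat indicates its direction and magnitude; the simulation comparisons in the paper concern how such statistics behave when item discriminations vary.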
A literature review was conducted to determine the current state of knowledge concerning the effects of computer administration of standardized educational and psychological tests on the psychometric properties of these instruments. Studies were grouped according to a number of factors relevant to the administration of tests by computer. Based on the studies reviewed, we arrived at the following conclusions. The rate at which test takers omit items in an automated test may differ from the rate at which they omit items in a conventional presentation. Scores on automated versions of personality inventories such as the Minnesota Multiphasic Personality Inventory are lower than scores obtained in the conventional testing format; these differences may result in part from differing omit rates, as described above, but some may be caused by other factors. Scores from automated versions of speed tests are not likely to be comparable with scores from paper-and-pencil versions. The presentation of graphics in an automated test may affect score equivalence: such effects were obtained in studies using the Hidden Figures Test, but not in studies with three Armed Services Vocational Aptitude Battery (ASVAB) tests. Tests containing items based on reading passages can become more difficult when presented on a CRT, as demonstrated in a single study with the ASVAB tests. The possibility of such asymmetric practice effects may make it wise to avoid conducting equating studies based on single-group counterbalanced designs.