A New Statistic for Detection of Aberrant Answer Changes

Johnson

2019

Brit J Math & Statis

Self Cite

According to Wollack and Schoenig (2018, The Sage encyclopedia of educational research, measurement, and evaluation. Thousand Oaks, CA: Sage, 260), benefiting from item preknowledge is one of the three broad types of test fraud that occur in educational assessments. We use tools from constrained statistical inference to suggest a new statistic that is based on item scores and response times and can be used to detect examinees who may have benefited from item preknowledge for the case when the set of compromised items is known. The asymptotic distribution of the new statistic under no preknowledge is proved to be a simple mixture of two χ² distributions. We perform a detailed simulation study to show that the Type I error rate of the new statistic is very close to the nominal level and that the power of the new statistic is satisfactory in comparison to that of the existing statistics for detecting item preknowledge based on both item scores and response times. We also include a real data example to demonstrate the usefulness of the suggested statistic.
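A minimal sketch of how a p-value could be read off a two-component χ² mixture null distribution like the one the abstract describes. The abstract does not state the mixture weights or the degrees of freedom, so the equal weights and the choice of 1 and 2 degrees of freedom below are assumptions made only for illustration, and the function name `chi2_mixture_pvalue` is hypothetical.

```python
import math


def chi2_mixture_pvalue(stat, weight_df1=0.5):
    """Upper-tail p-value under weight_df1 * chi2(1) + (1 - weight_df1) * chi2(2).

    ASSUMPTION: the degrees of freedom (1 and 2) and the default equal
    weights are illustrative; the abstract only says "a simple mixture
    of two chi-square distributions".
    """
    if stat < 0:
        raise ValueError("the statistic must be non-negative")
    if not 0.0 <= weight_df1 <= 1.0:
        raise ValueError("the mixture weight must lie in [0, 1]")
    # Survival function of chi2 with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    sf_df1 = math.erfc(math.sqrt(stat / 2.0))
    # Survival function of chi2 with 2 df: P(X > x) = exp(-x / 2).
    sf_df2 = math.exp(-stat / 2.0)
    return weight_df1 * sf_df1 + (1.0 - weight_df1) * sf_df2


if __name__ == "__main__":
    # Hypothetical observed value of the statistic, not taken from the paper.
    print(chi2_mixture_pvalue(5.0))
```

Both survival functions have closed forms, so the sketch needs only the standard library; an examinee would be flagged when the returned p-value falls below the chosen significance level.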

“…The use of Λ_{ST}…”

Section: Method: a New Statistic Based On Item Scores And Response Times

mentioning

confidence: 99%

The use of item scores and response times to detect examinees who may have benefited from item preknowledge

Johnson

2019

Brit J Math & Statis

Self Cite

“…Erasure analysis was also performed at the individual level using the L-index (Sinharay, Duong, & Wood, 2017). The values of the L-index agree with the values of EDI_g, EDI_g^N, and EDI_g^A for the data set.…”

Section: Results

mentioning

confidence: 99%

“…There are several limitations of this article and, consequently, several related topics can be further investigated. First, it is possible to extend other indices for detection of fraudulent erasures for individual examinees including those suggested by Sinharay and Johnson (2017), Sinharay et al (2017), and van der Linden and Lewis (2015) to the group level and a future study may compare the extensions suggested in this article to extensions of other individual-level statistics for detecting fraudulent erasures. Second, while our simulation study was detailed, it is possible to perform more simulations, possibly with other IRT models.…”

Section: Discussion

mentioning

confidence: 99%

“…To address the increasing interest in practice on erasure analysis, there has been an upswing in research on the topic. Recently, researchers such as Belov (2015), Sinharay et al (2017), Sinharay and Johnson (2017), van der Linden and Jeon (2012), van der Linden and Lewis (2015), Wollack et al (2015), and Wollack and Eckerly (2017) presented new statistics for individual-level or group-level erasure analysis. Sinharay et al (2017) performed a comprehensive comparison of several of these statistics at the individual level—they found the EDI (Wollack et al, 2015) and their suggested statistic L-index, which is based on the likelihood ratio statistic, to have performed the best.…”

Section: Introduction

mentioning

confidence: 99%

“…However, erasures essentially mean answer changes (ACs), and computer-based tests (CBTs) may also suffer from fraudulent ACs. Tiemann and Kingston (2014) and Sinharay, Duong, and Wood (2017) provided examples of CBTs in which ACs are allowed; fraudulent ACs can definitely occur for such tests. Wollack, Cohen, and Eckerly (2015) suggested the erasure detection index (EDI) to detect fraudulent erasures for individual examinees.…”

mentioning

confidence: 99%

Detecting Fraudulent Erasures at an Aggregate Level

Journal of Educational and Behavioral Statistics

2017

Self Cite

Wollack, Cohen, and Eckerly suggested the “erasure detection index” (EDI) to detect fraudulent erasures for individual examinees. Wollack and Eckerly extended the EDI to detect fraudulent erasures at the group level. The EDI at the group level was found to be slightly conservative. This article suggests two modifications of the EDI for the group level. The asymptotic null distribution of the two modified indices is proved to be the standard normal distribution. In a simulation study, the modified indices are shown to have Type I error rates close to the nominal level and larger power than the index of Wollack and Eckerly. A real data example is also included.

Higher-Order Asymptotics and Its Application to Testing the Equality of the Examinee Ability Over Two Sets of Items

Jensen

2018

Psychometrika

In educational and psychological measurement, researchers and/or practitioners are often interested in examining whether the ability of an examinee is the same over two sets of items. Such problems can arise in measurement of change, detection of cheating on unproctored tests, erasure analysis, detection of item preknowledge, etc. Traditional frequentist approaches that are used in such problems include the Wald test, the likelihood ratio test, and the score test (e.g., Fischer, Appl Psychol Meas 27:3-26, 2003; Finkelman, Weiss, & Kim-Kang, Appl Psychol Meas 34:238-254, 2010; Glas & Dagohoy, Psychometrika 72:159-180, 2007; Guo & Drasgow, Int J Sel Assess 18:351-364, 2010; Klauer & Rettig, Br J Math Stat Psychol 43:193-206, 1990; Sinharay, J Educ Behav Stat 42:46-68, 2017). This paper shows that approaches based on higher-order asymptotics (e.g., Barndorff-Nielsen & Cox, Inference and asymptotics. Springer, London, 1994; Ghosh, Higher order asymptotics. Institute of Mathematical Statistics, Hayward, 1994) can also be used to test for the equality of the examinee ability over two sets of items. The modified signed likelihood ratio test (e.g., Barndorff-Nielsen, Biometrika 73:307-322, 1986) and the Lugannani-Rice approximation (Lugannani & Rice, Adv Appl Prob 12:475-490, 1980), both of which are based on higher-order asymptotics, are shown to provide some improvement over the traditional frequentist approaches in three simulations. Two real data examples are also provided.
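For orientation, the Lugannani-Rice approximation named in the abstract can be written in its generic textbook form for a continuous random variable $X$ with cumulative generating function $K(s)$. This is a sketch of the general formula only; the symbols $\hat{s}$, $r$, and $u$ are notation introduced here, and how the paper adapts the formula to the test of equal ability over two item sets is not stated in the abstract.

```latex
% Saddlepoint \hat{s} solves K'(\hat{s}) = x, where K is the
% cumulant generating function of X and x is the observed value.
\begin{align*}
  r &= \operatorname{sgn}(\hat{s})\,\sqrt{2\,\bigl(\hat{s}\,x - K(\hat{s})\bigr)}, \\
  u &= \hat{s}\,\sqrt{K''(\hat{s})}, \\
  P(X \ge x) &\approx 1 - \Phi(r) + \phi(r)\left(\frac{1}{u} - \frac{1}{r}\right),
\end{align*}
% \Phi and \phi denote the standard normal distribution and density functions.
```

The leading term $1 - \Phi(r)$ is the tail probability given by the signed likelihood ratio statistic alone, and the second term is the higher-order correction.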