“…The method was illustrated with empirical data stemming from PIAAC 2012. The reported results underline that the general five-second rule, which ignores item-specific collateral information, is most likely too simple (see also Goldhammer, Martens, & Lüdtke, 2017). It is too simple because the response time thresholds varied between items and were considerably higher than five seconds (20 to 30 seconds on average).…”
Section: Response Time-based Treatment Of Omitted Responses In Comput… (citation type: mentioning)
A new response time-based method for coding omitted item responses in computer-based testing is introduced and illustrated with empirical data. The method is derived from the missing-data theory of Rubin and colleagues and embedded in an item response theory framework. Its basic idea is to use item response times to test statistically, for each individual item, whether omitted responses are missing completely at random (MCAR) or missing due to a lack of ability and thus not at random (MNAR), with fixed type-1 and type-2 error levels. If the MCAR hypothesis is maintained, omitted responses are coded as not administered (NA); otherwise they are coded as incorrect (0). The empirical illustration draws on the responses given by N = 766 students to 70 items of a computer-based ICT-skills test. The new method is compared with the two common deterministic methods of scoring omitted responses as 0 or as NA. Response time thresholds from 18 to 58 seconds were identified. More omitted responses were recoded into 0 (61 %) than into NA (39 %). The differences in item difficulty were larger when the new method was compared with deterministically scoring omitted responses as NA than when it was compared with scoring them as 0. The variances and reliabilities obtained under the three methods showed only small differences. The paper concludes with a discussion of the practical relevance of the observed effect sizes and with recommendations for applying the new method in the early stage of data processing.
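The recoding step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the statistical test that derives each item's threshold is not shown, the item names and threshold values are hypothetical (picked from within the reported 18 to 58 second range), and the direction of the rule (omissions faster than the threshold coded NA, slower ones coded 0) is an assumption modeled on the five-second rule mentioned in the citation statement above.

```python
from typing import Optional

# Hypothetical item-specific thresholds in seconds. The values are
# illustrative only; the abstract reports thresholds from 18 to 58 s.
THRESHOLDS = {"item_01": 18.0, "item_02": 58.0}


def recode_omission(item_id: str, response_time: float) -> Optional[int]:
    """Recode one omitted response as NA (None) or incorrect (0).

    Assumption: an omission made faster than the item's threshold is
    treated as MCAR and coded NA; a slower omission is treated as MNAR
    and coded as incorrect.
    """
    threshold = THRESHOLDS[item_id]
    return None if response_time < threshold else 0


# The same 30-second omission is coded differently on the two items,
# because each item has its own threshold.
codes = {item: recode_omission(item, 30.0) for item in THRESHOLDS}
```

With a single fixed threshold for all items, the same function would reduce to a general rule of the five-second type.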
“…As such, threshold sizes for traditional items may differ from those for innovative items. For example, Goldhammer et al. (2016, 2017) reported finding thresholds that ranged between 3 and 76 s for the Programme for the International Assessment of Adult Competencies (PIAAC) problem solving items. Kong et al. (2007) […] Therefore, researchers may face difficulties in applying these methods to innovative item types or items that have innovative features.…”
Section: Setting a Threshold For Detecting Rapid-guessing (citation type: mentioning)
When disengaged examinees spend too little time reading and considering the content of an item but still respond to it, a behavior known as rapid-guessing, their responses may not be representative of their ability. Rapid-guessing has been found to distort item parameters and the estimation of examinees' performance on cognitive tests (Schnipke and Scrams 1997; Wise 2015; Wise and DeMars 2006; Wise and Kingsbury 2016) and can undermine the validity of the inferences made based on scores. Therefore, detecting rapid-guessing…
“…For instance, one can assume the rate of correct responses to be around chance level for item completions that are classified as disengaged (Goldhammer, Martens, & Lüdtke, 2017; Lee & Jia, 2014). An experimental strategy could be based on the assumption that high-stakes testing causes a lower rate of disengaged completions than a low-stakes condition.…”
Section: Eliciting Process Data and Synthesizing It To Process Indica… (citation type: mentioning)
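The chance-level check mentioned in this citation statement can be sketched in Python as follows. This is an illustration under stated assumptions, not a procedure taken from the cited papers: responses are classified as disengaged by a simple time threshold, the item is assumed to be multiple choice with four options (chance level 0.25), and the function name and data are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def disengaged_accuracy(
    records: Iterable[Tuple[float, bool]], threshold: float
) -> Optional[float]:
    """Proportion correct among responses flagged as disengaged.

    Each record is (response_time_in_seconds, is_correct). A response is
    flagged as disengaged when its time is below `threshold` (assumed
    classification rule). Returns None if nothing is flagged.
    """
    flagged = [correct for rt, correct in records if rt < threshold]
    if not flagged:
        return None
    return sum(flagged) / len(flagged)


# Made-up responses to one four-option item.
records = [(2.0, True), (3.0, False), (1.5, False), (4.0, False), (40.0, True)]
CHANCE_LEVEL = 1 / 4  # assumes four equally attractive response options

accuracy = disengaged_accuracy(records, threshold=5.0)
```

An accuracy among flagged responses that sits close to `CHANCE_LEVEL` is consistent with the classification capturing disengaged behavior; an accuracy well above it would suggest the threshold also flags engaged responses.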