“…To our knowledge, the idea of using log data from an EAP to analyse exam items was first introduced in Neel's 1999 work, presented at the Annual Meeting of the AERA (cited in Jung Kim, 2001). To date, exam logs have mostly been used for measuring and modelling exam‐takers' accuracy, speed, revisits and effort (Bezirhan et al., 2021; Klein Entink et al., 2008; Sharma et al., 2020; Wise, 2015; Wise & Gao, 2017); analysing answering and revising behaviour during exams (Costagliola et al., 2008; Pagni et al., 2017); examining and enhancing metacognitive regulation of strategy use and cognitive processing (Dodonova & Dodonov, 2012; Goldhammer et al., 2014; Papamitsiou & Economides, 2015; Thillmann et al., 2013); classifying exam‐takers for the personalisation of testing services (Papamitsiou & Economides, 2017); validating the interpretation of test scores (Engelhardt & Goldhammer, 2019; Kane & Mislevy, 2017; Kong et al., 2007; Padilla & Benítez, 2014; Toton & Maynes, 2019; van der Linden & Guo, 2008); understanding exam‐takers' performance (Greiff et al., 2016; Kupiainen et al., 2014; Papamitsiou et al., 2014, 2018; Papamitsiou & Economides, 2013, 2014); enhancing item selection in adaptive testing environments (van der Linden, 2008); analysing exam items (Costagliola et al., 2008; Jung Kim, 2001); detecting cheating (Cleophas et al., 2021; Costagliola et al., 2008); and identifying test‐taking strategies (Costagliola et al., 2008). Nonetheless, most previous work has focused on time‐based behaviours and the interpretation of exam‐taker results; few studies have examined the potential of using exam‐taker behaviours to validate or enrich the interpretation of the quality of exam items.…”