In educational and psychological measurement, researchers and practitioners often want to examine whether the ability of an examinee is the same over two sets of items. Such problems arise in the measurement of change, the detection of cheating on unproctored tests, erasure analysis, and the detection of item preknowledge, among other settings. Traditional frequentist approaches to such problems include the Wald test, the likelihood ratio test, and the score test (e.g., Fischer, Appl Psychol Meas 27:3-26, 2003; Finkelman, Weiss, & Kim-Kang, Appl Psychol Meas 34:238-254, 2010; Glas & Dagohoy, Psychometrika 72:159-180, 2007; Guo & Drasgow, Int J Sel Assess 18:351-364, 2010; Klauer & Rettig, Br J Math Stat Psychol 43:193-206, 1990; Sinharay, J Educ Behav Stat 42:46-68, 2017). This paper shows that approaches based on higher-order asymptotics (e.g., Barndorff-Nielsen & Cox, Inference and asymptotics. Springer, London, 1994; Ghosh, Higher order asymptotics. Institute of Mathematical Statistics, Hayward, 1994) can also be used to test the equality of an examinee's ability over two sets of items. The modified signed likelihood ratio test (e.g., Barndorff-Nielsen, Biometrika 73:307-322, 1986) and the Lugannani-Rice approximation (Lugannani & Rice, Adv Appl Prob 12:475-490, 1980), both of which are based on higher-order asymptotics, are shown to provide some improvement over the traditional frequentist approaches in three simulations. Two real data examples are also provided.
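As a rough illustration of the traditional approach that the higher-order methods refine, the sketch below computes the ordinary (first-order) signed likelihood ratio statistic for the hypothesis that an examinee's ability is the same over two item sets. It assumes a Rasch model with known item difficulties, which is an assumption made here for simplicity and may differ from the models used in the paper. The function names (`rasch_loglik`, `rasch_mle`, `signed_lr_root`) and the example data are hypothetical. The modified signed likelihood ratio test and the Lugannani-Rice approximation would add a correction to this statistic that is not shown here.

```python
import math


def rasch_loglik(theta, responses, difficulties):
    """Log-likelihood of a 0/1 response vector under the Rasch model (assumed model)."""
    total = 0.0
    for x, b in zip(responses, difficulties):
        # log P(x | theta) = x * (theta - b) - log(1 + exp(theta - b))
        total += x * (theta - b) - math.log1p(math.exp(theta - b))
    return total


def rasch_mle(responses, difficulties, iters=50, tol=1e-10):
    """Newton-Raphson MLE of ability; requires a raw score strictly between 0 and n."""
    score = sum(responses)
    if score == 0 or score == len(responses):
        raise ValueError("MLE does not exist for zero or perfect scores")
    theta = 0.0
    for _ in range(iters):
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        gradient = score - sum(probs)                      # first derivative
        information = sum(p * (1.0 - p) for p in probs)    # minus second derivative
        step = gradient / information
        theta += step
        if abs(step) < tol:
            break
    return theta


def signed_lr_root(resp1, diff1, resp2, diff2):
    """First-order signed likelihood ratio statistic for H0: theta1 == theta2.

    Returns (r, p), where p = P(Z >= r) is the one-sided normal p-value
    against the alternative theta2 > theta1.
    """
    theta1 = rasch_mle(resp1, diff1)
    theta2 = rasch_mle(resp2, diff2)
    theta0 = rasch_mle(resp1 + resp2, diff1 + diff2)  # common ability under H0
    unrestricted = (rasch_loglik(theta1, resp1, diff1)
                    + rasch_loglik(theta2, resp2, diff2))
    restricted = (rasch_loglik(theta0, resp1, diff1)
                  + rasch_loglik(theta0, resp2, diff2))
    lr = max(0.0, 2.0 * (unrestricted - restricted))  # guard against rounding
    r = math.copysign(math.sqrt(lr), theta2 - theta1)
    p_one_sided = 0.5 * math.erfc(r / math.sqrt(2.0))
    return r, p_one_sided


# Hypothetical data: the same five difficulties in both sets, higher score on set 2.
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
set1 = [1, 1, 0, 0, 0]
set2 = [1, 1, 1, 1, 0]
r, p = signed_lr_root(set1, difficulties, set2, difficulties)
print(f"signed LR root r = {r:.3f}, one-sided p = {p:.3f}")
```

With short item sets like these, the normal reference distribution for `r` can be a poor approximation, which is the situation in which a higher-order correction would be expected to help.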