Introduction
With the growing body of research on Computational Thinking (CT) and the efforts to introduce CT into curricula worldwide, assessing CT at all levels of formal education is essential to ensure that CT-related learning objectives are met. This has contributed to a progressive increase in the number of validated and reliable CT assessments for K-12, including primary school. Researchers and practitioners must therefore choose among multiple instruments whose validated age ranges often overlap.

Methods
In this study, we compare the psychometric properties of two of these instruments: the Beginners' CT test (BCTt), developed for grades 1–6, and the competent CT test (cCTt), validated for grades 3–4. Classical Test Theory and Item Response Theory (IRT) were applied to data acquired from 575 students in grades 3–4 to compare the properties of the two instruments and refine the limits of their validity.

Results
The findings (i) establish the detailed psychometric properties of the BCTt in grades 3–4 for the first time, and (ii) through a comparison with students from the same country, indicate that the cCTt should be preferred for grades 3–4, as it is able to discriminate between students of low and medium ability. Conversely, the BCTt, which is easier, shows a ceiling effect but is better suited to discriminating between students in the low ability range. For these grades, the BCTt can thus be employed as a screening mechanism to identify low ability students.

Discussion
In addition to providing recommendations for the use of these instruments, the findings highlight the importance of comparing the psychometric properties of existing assessments, so that researchers and practitioners, including teachers and policy makers involved in digital education curricular reforms, can make informed decisions when selecting assessments.