Background
Computer-based assessments of paediatrics in our institution use a series of clinical cases, in which information is delivered to the students progressively, in sequential order. Three item formats are mainly used: Type A (single answer), Pick N, and Long-menu. Long-menu questions rely on a long, hidden list of possible answers: based on the student's initial free-text response, the program narrows the list, allowing the student to select the answer. This study analyses the psychometric properties of Long-menu questions compared with the two other commonly used formats, Type A and Pick N.

Methods
We reviewed the difficulty level and discrimination index of the items in the paediatric exams from 2009 to 2015, and compared the Long-menu questions with the Type A and Pick N questions using multi-way analyses of variance.

Results
Our dataset included 13 exam sessions with 855 students and 558 items included in the analysis: 212 (38 %) Long-menu, 201 (36 %) Pick N, and 140 (25 %) Type A items. There was a significant format effect on both level of difficulty (p = .005) and discrimination index (p < .001). Long-menu questions were easier than Type A questions (+5.2 %; 95 % CI 1.1–9.4 %), and more discriminative than both Type A (+0.07; 95 % CI 0.01–0.14) and Pick N (+0.10; 95 % CI 0.05–0.16) questions.

Conclusions
Long-menu questions show good psychometric properties when compared with more common formats such as Type A or Pick N, though confirmatory studies are needed. They provide more variety, reduce the cueing effect, and thus may reflect real-life practice more closely than the other item formats used in computer-based assessments, which were inherited from paper-based examinations.
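
For readers unfamiliar with the two item statistics compared in this study, the following is a minimal sketch of how they are commonly computed. It rests on assumptions not stated in the abstract: difficulty is taken as the mean item score (the proportion correct, so that a higher value means an easier item), and the discrimination index is taken as the corrected item-total correlation (the item score correlated with the total score on the remaining items). The study may have used a different definition. The function names and the score matrix layout are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def item_statistics(scores):
    """Return one (difficulty, discrimination) pair per item.

    scores: one row per student, one column per item, each score in [0, 1].
    Assumption: difficulty = mean item score; discrimination = correlation
    between the item score and the total score on all other items.
    """
    n_items = len(scores[0])
    totals = [sum(row) for row in scores]
    stats = []
    for j in range(n_items):
        item = [row[j] for row in scores]
        # Remove the item from the total so it is not correlated with itself.
        rest = [total - score for total, score in zip(totals, item)]
        difficulty = sum(item) / len(item)
        discrimination = pearson(item, rest)
        stats.append((difficulty, discrimination))
    return stats


# Hypothetical scores for 4 students on 3 items (illustrative only, not study data).
example_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
example_stats = item_statistics(example_scores)
```

Under these assumptions, a format effect such as the one reported would appear as a difference in the mean of these per-item values between Long-menu, Type A, and Pick N items.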