Background
In medical education, particularly in biochemistry, writing high-quality assessment questions is a major challenge. Each item requires thorough evaluation, and student ability must be estimated accurately if exam results are to reflect learning achievement.
Objective
This study aims to enhance assessment quality in biochemistry medical education by implementing Item Response Theory (IRT), which addresses limitations of Classical Test Theory (CTT). Recognizing the critical role of question quality in the learning process, the study investigates how IRT can assess student abilities more holistically and equitably. It includes a comparative analysis of student scores before and after IRT implementation.
Methods
This study uses a mixed-method approach that combines comparative quantitative analysis with qualitative analysis of Item Characteristic Curves (ICCs) in a pre-post experimental design. It focuses on biochemistry exam data from medical students (n = 89). IRT is used to model the probability of a correct student response to each question, using item parameters such as discrimination, difficulty, and guessing probability. The analysis is run in jamovi, which speeds up the computation.
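The three parameters named above (discrimination, difficulty, guessing) are those of the three-parameter logistic (3PL) model. The abstract does not state which IRT model was fitted, so the following is only a minimal sketch under that assumption. The function name `icc_3pl` and all item values are hypothetical and are not taken from the study data.

```python
import math


def icc_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    theta: student ability
    a: item discrimination
    b: item difficulty
    c: guessing probability (lower asymptote)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item, chosen only for illustration.
item = {"a": 1.2, "b": 0.5, "c": 0.2}

# Evaluating the curve at several abilities traces the item's ICC.
for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    p = icc_3pl(theta, item["a"], item["b"], item["c"])
    print(f"theta = {theta:+.1f}  P(correct) = {p:.3f}")
```

In this formulation, a larger `a` makes the curve steeper around `b`, so the item separates students just below and just above that ability more sharply, while `c` sets the floor that very low-ability students reach by guessing.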
Results
Significant improvements were observed in both question quality and student scores. Prior to IRT implementation, the mean score on the initial exam was 56.1, which rose to 74.1 on the subsequent exam. The IRT evaluation indicated that the exam questions differentiated more effectively between students of varying abilities. This improvement was evident from the increased person reliability and from Wright Map visualizations, which, together with the Item Characteristic Curve (ICC), helped identify highly difficult questions.
Conclusion
The study advocates integrating IRT as a standard method in biochemistry medical assessments. It highlights the need for assessments that are sensitive to individual student capabilities and that provide more precise feedback for improving the quality of learning. These findings can inform the development of evaluation methodologies and the advancement of medical education standards.