Accurately judging the quality of one’s estimates is a critical psychological process that varies across individuals, fluctuates over time, and is highly domain-specific. While there are many methods for assessing judgment calibration during basic sensory tasks and tests of general knowledge, the field lacks a rigorous, controlled tool for assessing this form of metacognitive monitoring during category learning or goal-directed attention. This gap is especially important given the relevance of category learning and goal-directed attention to both high-level functioning and psychiatric disease. We therefore developed a tool that pairs an established category learning paradigm with a standard metacognitive report, and validated it through longitudinal testing. A total of 80 adult subjects completed three sequential sessions online and answered a set of personality and psychiatric trait questionnaires. Test-retest results suggested that our tool provides fair reliability in assessing judgment calibration. We found that aloofness and introversion were associated with better metacognitive resolution. Furthermore, the best-calibrated subjects were more likely to report narrowing their attention to a subset of goal-relevant stimulus features. Our results suggest specific ways to improve the tool's retest stability, while validating its use for longitudinal assessment of individual differences in judgment calibration during category learning.