Metacognitive biases have been repeatedly associated with transdiagnostic psychiatric dimensions of ‘anxious-depression’ and ‘compulsivity and intrusive thought’, cross-sectionally. To progress our understanding of the underlying neurocognitive mechanisms, new methods are required to measure metacognition remotely, within individuals over time. We developed a gamified smartphone task designed to measure visuo-perceptual metacognitive (confidence) bias and investigated its psychometric properties across two studies (N = 3410 unpaid citizen scientists, N = 52 paid participants). We assessed convergent validity, split-half and test–retest reliability, and identified the minimum number of trials required to capture its clinical correlates. Convergent validity of metacognitive bias was moderate (r(50) = 0.64, p < 0.001) and it demonstrated excellent split-half reliability (r(50) = 0.91, p < 0.001). Anxious-depression was associated with decreased confidence (β = − 0.23, SE = 0.02, p < 0.001), while compulsivity and intrusive thought was associated with greater confidence (β = 0.07, SE = 0.02, p < 0.001). The associations between metacognitive biases and transdiagnostic psychiatry dimensions are evident in as few as 40 trials. Metacognitive biases in decision-making are stable within and across sessions, exhibiting very high test–retest reliability for the 100-trial (ICC = 0.86, N = 110) and 40-trial (ICC = 0.86, N = 120) versions of Meta Mind. Hybrid ‘self-report cognition’ tasks may be one way to bridge the recently discussed reliability gap in computational psychiatry.