The Self Perceptual Matching Task (SPMT) is widely used to investigate the cognitive processes underlying the Self-Prioritization Effect (SPE), in which performance is enhanced for self-associated stimuli relative to other-associated ones. Despite its wide use, the reliability of the SPMT has received little attention. This neglect is concerning given the prevalence of the reliability paradox in cognitive tasks: tasks that produce robust experimental effects often show relatively low reliability when used to assess individual differences. To fill this gap, this preregistered study investigated the reliability of the SPMT using a multiverse approach, combining all possible indicators and baselines used to quantify the SPE in the SPMT. We examined the robustness and reliability of 24 SPE measures across 17 datasets (N = 805). Specifically, we used a meta-analytical approach to estimate the robustness of the SPE across datasets, and we calculated the split-half reliability (r) and the intraclass correlation coefficient (ICC2) for each SPE measure. Our findings revealed a robust experimental effect of SPE across datasets. For individual differences, however, SPE measures derived from reaction time (RT) and efficiency showed higher split-half reliability than the other SPE measures, but it was still unsatisfactory (approximately 0.6). Similarly, for reliability across multiple time points, as assessed by ICC2, RT and efficiency showed low test-retest reliability (close to 0.5). These findings reveal a reliability paradox in SPMT-based assessment of the SPE. We discuss the implications of our findings for future studies.
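To make the split-half procedure concrete, the following is a minimal sketch of how a split-half reliability (r) could be computed for one RT-based SPE measure. It is an illustration under stated assumptions, not the study's actual pipeline: the abstract does not specify the splitting scheme, the number of splits, or any correction, so the random stratified split, the difference-score definition (mean other RT minus mean self RT), the function name `split_half_reliability`, and the simulated data are all hypothetical.

```python
import numpy as np


def split_half_reliability(rt, is_self, n_splits=200, seed=0):
    """Average split-half correlation of an RT-based SPE difference score.

    rt:      (n_participants, n_trials) array of reaction times
    is_self: (n_trials,) boolean array, True for self-associated trials
    Assumption: SPE = mean RT(other) - mean RT(self) for each participant.
    """
    rng = np.random.default_rng(seed)
    self_idx = np.flatnonzero(is_self)
    other_idx = np.flatnonzero(~is_self)
    correlations = []
    for _ in range(n_splits):
        # Shuffle trials within each condition so both halves stay balanced.
        s = rng.permutation(self_idx)
        o = rng.permutation(other_idx)
        s1, s2 = s[: len(s) // 2], s[len(s) // 2:]
        o1, o2 = o[: len(o) // 2], o[len(o) // 2:]
        # One SPE score per participant in each half.
        spe_half1 = rt[:, o1].mean(axis=1) - rt[:, s1].mean(axis=1)
        spe_half2 = rt[:, o2].mean(axis=1) - rt[:, s2].mean(axis=1)
        correlations.append(np.corrcoef(spe_half1, spe_half2)[0, 1])
    # Uncorrected mean correlation across random splits.
    return float(np.mean(correlations))


# Simulated data (hypothetical values, for illustration only).
rng_demo = np.random.default_rng(1)
n_participants, n_trials = 50, 120
is_self_demo = np.arange(n_trials) % 2 == 0
true_spe = rng_demo.normal(50.0, 30.0, size=n_participants)  # ms
noise = rng_demo.normal(0.0, 100.0, size=(n_participants, n_trials))
rt_demo = 600.0 + noise - np.outer(true_spe, is_self_demo.astype(float))
r_demo = split_half_reliability(rt_demo, is_self_demo)
print(f"mean split-half r = {r_demo:.2f}")
```

The sketch shows why a group-level effect can be robust while individual-level reliability stays modest: the mean SPE is clearly above zero in the simulated data, yet trial-level noise that is large relative to the between-participant spread in SPE pulls the correlation between halves down.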