Do we preferentially learn from outcomes that confirm our choices? This is one of the most basic, and yet most consequential, questions in reinforcement learning. In recent years, we investigated this question in a series of studies implementing increasingly complex behavioral protocols. The learning rates fitted in experiments featuring partial or complete feedback, as well as free and forced choices, were systematically found to be consistent with a choice-confirmation bias. This result is robust across a broad range of outcome contingencies and response modalities. One of the prominent behavioral consequences of the confirmatory learning rate pattern is choice hysteresis, that is, the tendency to repeat previous choices despite contradictory evidence. As robust and replicable as they have proven to be, these findings were (legitimately) challenged by a couple of studies pointing out that a choice-confirmatory pattern of learning rates may spuriously arise when the model does not include an explicit choice autocorrelation term. In the present study, we re-analyze data from four previously published papers (nine experiments in total; N=363), originally included in the studies demonstrating (or criticizing) the choice-confirmation bias in human participants. We fitted two models: one featuring valence-specific updates (i.e., different learning rates for confirmatory and disconfirmatory outcomes) and one additionally including an explicit choice autocorrelation process (gradual perseveration). Our analysis confirms that including the gradual perseveration process in the model significantly reduces the estimated choice-confirmation bias. However, the choice-confirmation bias remains present at the meta-analytical level across all considered experiments, and is significantly different from zero in most individual experiments.
Our results demonstrate that the choice-confirmation bias resists the inclusion of an explicit choice autocorrelation term, thus proving to be a robust feature of human reinforcement learning. We conclude by discussing the psychological plausibility of the gradual perseveration process in the context of these behavioral paradigms and by pointing to additional computational processes that may play an important role in estimating and interpreting the biases under scrutiny.
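
To make the two model variants concrete, the following is a minimal sketch of a two-option learner with separate learning rates for confirmatory and disconfirmatory outcomes, plus an optional gradual perseveration (choice trace) term. The parameterization is an assumption for illustration, not the fitted models themselves: the names `alpha_conf`, `alpha_disc`, `tau`, `beta`, and `phi`, the treatment of a zero prediction error as disconfirmatory, and the logistic choice rule are all hypothetical choices. Setting `phi = 0` removes the perseveration term and leaves only the valence-specific updates.

```python
import math

def update_values(q, choice, r_chosen, r_unchosen, alpha_conf, alpha_disc):
    """Return updated option values for a two-option task.

    Assumption: an outcome is 'confirmatory' when the chosen option yields a
    positive prediction error, or the unchosen option a negative one.
    r_unchosen is None under partial feedback (no counterfactual outcome).
    """
    q = list(q)
    other = 1 - choice

    delta_c = r_chosen - q[choice]
    alpha = alpha_conf if delta_c > 0 else alpha_disc
    q[choice] += alpha * delta_c

    if r_unchosen is not None:
        delta_u = r_unchosen - q[other]
        alpha = alpha_conf if delta_u < 0 else alpha_disc
        q[other] += alpha * delta_u
    return q

def update_trace(trace, choice, tau):
    """Gradual perseveration: move each choice trace toward 1 if the option
    was just chosen and toward 0 otherwise, at rate tau."""
    return [c + tau * ((1.0 if i == choice else 0.0) - c)
            for i, c in enumerate(trace)]

def prob_choose_first(q, trace, beta, phi):
    """Logistic choice rule combining value and choice-trace differences.
    With phi = 0 the perseveration term drops out."""
    logit = beta * (q[0] - q[1]) + phi * (trace[0] - trace[1])
    return 1.0 / (1.0 + math.exp(-logit))
```

In this sketch, a choice-confirmation bias corresponds to `alpha_conf > alpha_disc`, while choice autocorrelation that is independent of outcomes is carried by `phi` and `tau`. Both mechanisms can produce repetition of past choices, which is why fitting only the learning rates may attribute perseveration to the learning rate asymmetry.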