The sex hormone estradiol has recently gained attention in human decision-making research. Animal studies have shown that estradiol promotes dopaminergic transmission and thereby supports reward-seeking behavior and aspects of addiction. In humans, natural variations of estradiol across the menstrual cycle modulate the ability to learn from direct performance feedback (“model-free” learning). However, it remains unclear whether estradiol also influences the more complex “model-based” contributions to reinforcement learning. Here, 41 women were tested twice, once in the low and once in the high estradiol state of the follicular phase of their menstrual cycle, with a Two-Step decision task designed to separate model-free from model-based learning. In the high estradiol state, women relied more heavily on model-free learning and achieved smaller performance gains, particularly during the more volatile periods of the task that demanded increased learning effort. In contrast, model-based control was unaffected by hormonal state across the group. Yet, when accounting for individual differences in the COMT-Val158Met polymorphism (rs4680), a genetic proxy for prefrontal dopamine levels, we observed that only the participants homozygous for the methionine allele (n = 12; with putatively higher prefrontal dopamine) experienced a decline in model-based control when facing volatile reward probabilities. This group also showed the increase in suboptimal model-free control, whereas carriers of the valine allele remained unaffected by the rise in endogenous estradiol. Taken together, these preliminary findings suggest that endogenous estradiol may affect the balance between model-based and model-free control, particularly in women with a high prefrontal baseline dopamine capacity and in situations of increased environmental volatility.