In everyday life, people often have to switch back and forth between environments that pose different problems and differ in volatility. Whereas volatile environments require fast learning (i.e., a high learning rate), stable environments call for lower learning rates. Previous studies have shown that people can adapt their learning rate while remaining in one environment, but it remains unclear whether they can also learn to associate different learning rates with different environments, and instantaneously retrieve these environment-specific learning rate settings when revisiting those environments. Here, we first hypothesised that people can switch back and forth between two different environments and learn to use a different, optimal learning rate for each. Second, we tested whether people would also show an echo of this learning rate difference in a test phase in which both environments had the same volatility. Using optimality simulations and Bayesian hierarchical modelling, we demonstrate across three experiments (total N = 273) that people can alternate between two different learning rates, on a trial-by-trial basis, when switching back and forth between two two-armed bandit tasks set in two different environments (i.e., casinos) that differ in volatility. Results from the test phase further suggest that participants learned to attribute these different learning rates to their respective environments, as a small difference in learning rate and choice reversal remained around the first reward contingency switch. We conclude that people can flexibly adapt learning rates and learn to associate them with different environments, on a trial-by-trial basis, offering important insights for developing theories of meta-learning and context-specific control.
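To make the notion of a learning rate concrete, the following is a minimal sketch of a learner in a two-armed bandit whose reward contingencies switch at regular intervals. It is an illustration only and is not taken from the study: the delta-rule update, the epsilon-greedy choice rule, and all names and parameter values (`delta_update`, `simulate`, `switch_every`, `p_high`, `epsilon`) are assumptions introduced here, not the model or task settings used in the experiments.

```python
import random


def delta_update(value, reward, learning_rate):
    """Move a value estimate toward the observed reward.

    A learning rate near 1 tracks recent outcomes closely (suited to
    volatile environments); a rate near 0 averages over a long history
    (suited to stable environments).
    """
    return value + learning_rate * (reward - value)


def simulate(learning_rate, switch_every, n_trials=2000,
             p_high=0.8, epsilon=0.1, seed=0):
    """Return the proportion of trials on which the better arm was chosen.

    Assumed setup (hypothetical, for illustration): one arm pays out with
    probability p_high, the other with 1 - p_high, and the two arms swap
    roles every `switch_every` trials. Smaller `switch_every` means a
    more volatile environment.
    """
    rng = random.Random(seed)
    values = [0.5, 0.5]   # initial value estimate for each arm
    best_arm = 0
    n_correct = 0
    for t in range(n_trials):
        if t > 0 and t % switch_every == 0:
            best_arm = 1 - best_arm  # reward contingency switch
        # Epsilon-greedy choice: mostly pick the higher-valued arm.
        if rng.random() < epsilon:
            choice = rng.randrange(2)
        else:
            choice = 0 if values[0] >= values[1] else 1
        p_reward = p_high if choice == best_arm else 1.0 - p_high
        reward = 1.0 if rng.random() < p_reward else 0.0
        values[choice] = delta_update(values[choice], reward, learning_rate)
        n_correct += int(choice == best_arm)
    return n_correct / n_trials
```

Calling `simulate` over a grid of learning rates, once with a small `switch_every` (volatile) and once with a large one (stable), gives a rough feel for how the best-performing learning rate depends on volatility. The exact values depend entirely on the assumed parameters above.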