Efficient decision-making requires two key processes: learning the values of actions and identifying the set of relevant actions to learn from in a given context. While dopamine (DA) is a well-established substrate for signaling reward prediction errors (RPEs) from selected actions to adjust behavior, how action representations are established and switched between remains poorly understood. To address this gap, we used fiber photometry and computational modelling in a three-armed bandit task in which mice learned to seek rewards delivered according to three successive rule sets, displaying a distinct strategy under each rule. We show that DA dynamically reflected RPEs computed from different task features, revealing context-specific internal representations. Our findings demonstrate that mice learned and updated not only action values but also action representations, adapting the features from which they learned across rules to flexibly adjust their decision strategy.
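To make the notion of "RPEs computed from different task features" concrete, the sketch below shows a generic Q-learning agent in a three-armed bandit that computes RPEs over two candidate feature spaces: arm identity and a repeat-versus-switch representation. This is a minimal illustration of the general idea, not the authors' model; the learning rate, reward probabilities, softmax policy, and the specific feature spaces are all assumptions chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_arms = 3
alpha = 0.1                                # illustrative learning rate
reward_prob = np.array([0.2, 0.8, 0.2])    # hypothetical rule: middle arm most rewarded

# Two candidate feature spaces the agent could learn over (assumed for illustration):
Q_identity = np.zeros(n_arms)   # value per physical arm
Q_repeat = np.zeros(2)          # value of switching (0) vs. repeating (1) the last choice

prev_choice = rng.integers(n_arms)
for t in range(1000):
    # Softmax choice on arm-identity values (simplified policy)
    p = np.exp(Q_identity) / np.exp(Q_identity).sum()
    choice = rng.choice(n_arms, p=p)
    reward = float(rng.random() < reward_prob[choice])

    # RPE computed from the arm-identity representation
    rpe_identity = reward - Q_identity[choice]
    Q_identity[choice] += alpha * rpe_identity

    # RPE computed from the repeat-vs-switch representation of the same choice
    repeat = int(choice == prev_choice)
    rpe_repeat = reward - Q_repeat[repeat]
    Q_repeat[repeat] += alpha * rpe_repeat

    prev_choice = choice
```

Under this kind of scheme, the two feature spaces yield different trial-by-trial RPE traces for the same choices and rewards, which is what would allow a DA signal to indicate which representation the animal is currently learning from.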