Internal representations are thought to support the generation of flexible, long-timescale behavioral patterns in both animals and artificial agents. Here, we present a novel conceptual framework for how Drosophila use their internal representation of head direction to maintain preferred headings in their surroundings, and how they learn to modify these preferences in the presence of selective thermal reinforcement. To develop the framework, we analyzed flies’ behavior in a classical operant visual learning paradigm and found that they use stochastically generated fixations and directed turns to express their heading preferences. Symmetries in the visual scene used in the paradigm allowed us to expose how flies’ probabilistic behavior in this setting is tethered to their head direction representation. We describe how flies’ ability to quickly adapt their behavior to the rules of their environment may rest on a behavioral policy whose parameters are flexible but whose form is genetically encoded in the structure of their circuits. Many of the mechanisms we outline may also be relevant for rapidly adaptive behavior driven by internal representations in other animals, including mammals.