Animals chain movements into long-lived motor strategies, exhibiting variability across scales that reflects the interplay between internal states and environmental cues. To reveal structure in such variability, we build Markov models of movement sequences that bridge across timescales and enable a quantitative comparison of behavioral phenotypes among individuals. Applying this approach to larval zebrafish responding to diverse sensory cues, we uncover a hierarchy of long-lived motor strategies, dominated by changes in orientation that distinguish cruising from wandering strategies. Environmental cues induce preferences along these modes at the population level: fish cruise in the light, whereas they wander in response to aversive stimuli or when searching for appetitive prey. Because our method encodes the behavioral dynamics of each individual fish in the transitions among coarse-grained motor strategies, we use it to uncover a hierarchical structure in the phenotypic variability that reflects exploration–exploitation trade-offs. Across a wide range of sensory cues, a major source of variation among fish is prior and/or immediate exposure to prey, which induces exploitation phenotypes. A large fraction of the variability is not explained by environmental cues and instead reveals hidden states that override the sensory context to induce contrasting exploration–exploitation phenotypes. Altogether, by extracting the timescales of motor strategies deployed during navigation, our approach exposes structure among individuals and reveals internal states tuned by prior experience.
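To make the idea of a Markov model over coarse-grained motor strategies concrete, the following is a minimal sketch and not the method used in this work. It assumes that behavior has already been reduced to a sequence of two discrete labels (hypothetically, 0 for "cruise" and 1 for "wander"), estimates the transition matrix by counting, and reads off a relaxation timescale from the second eigenvalue, which for a two-state chain equals the trace minus one. The function names and the toy sequence are illustrative assumptions.

```python
import math
from collections import Counter


def transition_matrix(seq, n_states):
    """Maximum-likelihood transition matrix from a list of integer state labels."""
    # Count how often state i is immediately followed by state j.
    counts = Counter(zip(seq[:-1], seq[1:]))
    P = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        if row_total == 0:
            raise ValueError(f"state {i} has no outgoing transitions")
        # Normalize each row so that it sums to one.
        P.append([counts[(i, j)] / row_total for j in range(n_states)])
    return P


def two_state_timescale(P):
    """Relaxation time, in steps, of a two-state Markov chain.

    A 2x2 row-stochastic matrix has eigenvalues 1 and (trace - 1);
    the timescale is -1 / ln(lambda_2), defined here for 0 < lambda_2 < 1.
    """
    lam2 = P[0][0] + P[1][1] - 1.0
    if not 0.0 < lam2 < 1.0:
        raise ValueError("timescale is only defined here for 0 < lambda_2 < 1")
    return -1.0 / math.log(lam2)


# Toy label sequence, made up for illustration: 0 = "cruise", 1 = "wander".
seq = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
P = transition_matrix(seq, 2)
tau = two_state_timescale(P)
```

In this toy example, each state persists with probability 0.75 per step, so the second eigenvalue is 0.5 and the relaxation time is 1/ln 2 steps. Comparing such transition matrices across individuals is one simple way to picture how transitions among strategies could serve as a behavioral phenotype.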