Foraging animals must use decision-making strategies that dynamically account for uncertainty in the world. To cope with this uncertainty, animals have developed strikingly convergent strategies that use information about multiple past choices and rewards to learn representations of the current state of the world. However, the learning rules that drive this process have remained unclear. Here, working in the relatively simple nervous system of Drosophila, we combine behavioral measurements, mathematical modeling, and neural circuit perturbations to show that dynamic foraging depends on a learning rule incorporating reward expectation. Using a novel olfactory dynamic foraging task, we characterize the behavioral strategies used by individual flies faced with unpredictable rewards and show, for the first time, that they perform operant matching. Building on past theoretical work, we demonstrate that this strategy requires a covariance-based learning rule in the mushroom body, a hub for learning in the fly. In particular, the behavioral consequences of optogenetic perturbation experiments suggest that this learning rule incorporates reward expectation. Our results identify a key element of the algorithm underlying dynamic foraging in flies and suggest a mechanism that could be fundamental to these behaviors across species.
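The covariance-based rule invoked above can be illustrated with a toy simulation: when weight updates are proportional to the difference between the received reward and a running reward expectation, a simple two-option agent drifts toward operant matching, with its choice fraction approximating its income fraction. All specifics below (the baited reward schedule, the logistic choice rule, the parameter values) are illustrative assumptions, not the task or model used in this study.

```python
import math
import random

def simulate(p_bait=(0.3, 0.1), alpha=0.2, beta=0.05,
             n_trials=20000, seed=1):
    """Two-option foraging with persisting ('baited') rewards and a
    covariance-style rule: dw = alpha * (r - rbar) for the chosen option.
    A hypothetical sketch, not the model fit to fly behavior."""
    rng = random.Random(seed)
    w = [0.0, 0.0]          # learned option values (stand-ins for synaptic weights)
    rbar = 0.0              # running estimate of expected reward
    baited = [False, False]
    n_choice = [0, 0]
    n_reward = [0.0, 0.0]
    for _ in range(n_trials):
        for i in (0, 1):    # each option is re-baited independently per trial
            if not baited[i] and rng.random() < p_bait[i]:
                baited[i] = True
        d = max(-30.0, min(30.0, w[0] - w[1]))
        p_choose_0 = 1.0 / (1.0 + math.exp(-d))   # logistic choice rule
        c = 0 if rng.random() < p_choose_0 else 1
        r = 1.0 if baited[c] else 0.0
        baited[c] = False   # collecting a reward clears the bait
        # Covariance rule: only deviations of reward from its expectation
        # change the weight, so fully expected rewards stop driving learning.
        w[c] += alpha * (r - rbar)
        rbar += beta * (r - rbar)
        n_choice[c] += 1
        n_reward[c] += r
    return n_choice, n_reward

choices, rewards = simulate()
choice_frac = choices[0] / sum(choices)
income_frac = rewards[0] / sum(rewards)
# Operant matching predicts choice_frac to be close to income_frac.
```

Because the expectation term subtracts the average reward, updates covary with reward fluctuations rather than raw reward, which is the feature that drives the matching steady state.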