Schemas are rich, complex knowledge structures that capture the typical unfolding of events in a given context, such as a dinner at a restaurant. Schemas are central constructs in psychology and neuroscience. Here, we suggest that reinforcement learning (RL), a computational theory of learning the structure of the world and of goal-directed behavior, underlies schema learning. Synthesizing the schema and RL literatures, we propose that three RL principles might govern the learning of schemas: learning via prediction errors, constructing hierarchical knowledge using hierarchical RL, and dimensionality reduction through learning a simplified, abstract representation of the world. We then propose that the orbito-medial prefrontal cortex is involved in both schemas and RL through its role in dimensionality reduction and in guiding memory reactivation via interactions with posterior brain regions. Finally, we hypothesize that the degree of dimensionality reduction might underlie gradients of involvement along the ventral-dorsal and posterior-anterior axes of the orbito-medial prefrontal cortex: more specific and detailed representations might engage the ventral and posterior parts, while abstraction might shift representations toward the dorsal and anterior parts of the medial prefrontal cortex.
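As an illustrative sketch of the first principle mentioned above, learning via prediction errors, the following tabular TD(0) example updates state values over a simplified restaurant-style event sequence. The state names, rewards, and parameters are hypothetical choices for illustration, not taken from the paper; the update rule is the standard temporal-difference rule delta = r + gamma * V(s') - V(s).

```python
# Hedged sketch: tabular TD(0) value learning over a schematic event
# sequence. States, rewards, and hyperparameters are hypothetical.

ALPHA = 0.1   # learning rate
GAMMA = 0.9   # temporal discount factor

# Each step is (state, reward received on the transition out of that state).
# Here, the rewarding event is "eat"; all other transitions yield no reward.
episode = [("enter", 0.0), ("order", 0.0), ("eat", 1.0), ("pay", 0.0)]

def td_pass(values, episode):
    """One pass of TD(0) over the episode, updating values in place."""
    for i, (state, reward) in enumerate(episode):
        # Value of the successor state; 0.0 beyond the end of the episode.
        next_value = values.get(episode[i + 1][0], 0.0) if i + 1 < len(episode) else 0.0
        # Prediction error: observed (reward + discounted future) vs. expected.
        delta = reward + GAMMA * next_value - values.get(state, 0.0)
        values[state] = values.get(state, 0.0) + ALPHA * delta
    return values

values = {}
for _ in range(200):
    td_pass(values, episode)
# With repeated episodes, predicted value propagates backward through the
# sequence: states closer to the rewarded "eat" step acquire higher value.
```

In this sketch, repeated exposure to the same event sequence drives prediction errors toward zero, leaving a graded value structure over the sequence (enter < order < eat), which is one concrete sense in which prediction-error learning could carve out the statistical regularities a schema encodes.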