When faced with imminent danger, animals must rapidly take defensive actions to reach safety. Mice can react to innately threatening stimuli in less than 250 milliseconds [1] and, in simple environments, use spatial memory to quickly escape to shelter [2,3]. Natural habitats, however, often offer multiple routes to safety, which animals must rapidly identify and choose between to maximize their chances of survival [4]. This is challenging because, although rodents can learn to navigate complex mazes to obtain rewards [5,6], learning the value of different routes through trial-and-error during escape from threat would likely be deadly. Here we investigate how mice learn to choose between different escape routes to shelter. Using environments with paths to shelter of varying length and geometry, we find that mice prefer options that minimize both path distance and path angle relative to the shelter. This choice strategy is already present during the first threat encounter and after only ~10 minutes of exploration in a novel environment, indicating that route selection does not require experience of escaping. Instead, an innate heuristic is used to assign threat survival value to alternative paths after rapidly learning the spatial environment. This route selection process is flexible and allows quick adaptation to arenas with dynamic geometries. Computational modelling of different classes of reinforcement learning agents shows that the observed behavior can be replicated by model-based agents acting in an environment where the shelter location is rewarding during exploration. These results show that mice combine fast spatial learning with innate heuristics to choose escape routes with the highest survival value. They further suggest that integrating priors acquired through evolution with knowledge learned from experience supports adaptation to changing environments while minimizing the need for trial-and-error when errors are very costly.
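
To make the model-based idea concrete, the following minimal sketch may help. It is not the modelling used in this study: the arena graph, node names, discount factor, and function names are all hypothetical, and only path length is captured (path angle relative to the shelter is ignored). It assumes the agent already holds a map of the arena from exploration and that reaching the shelter yields a reward of 1. Under those assumptions, planning over the map by value iteration is enough to prefer the shorter arm on the very first escape, with no trial-and-error during threat.

```python
from typing import Dict, List

# Hypothetical two-arm arena, stored as an adjacency list learned during exploration.
# "T" is the threat platform and "S" the shelter; the left arm (via L1) is
# shorter than the right arm (via R1, R2, R3).
ARENA: Dict[str, List[str]] = {
    "T": ["L1", "R1"],
    "L1": ["T", "S"],
    "R1": ["T", "R2"],
    "R2": ["R1", "R3"],
    "R3": ["R2", "S"],
    "S": [],  # terminal state: the shelter
}


def step_reward(next_node: str, goal: str) -> float:
    """Assumed reward scheme: 1 for stepping into the shelter, 0 otherwise."""
    return 1.0 if next_node == goal else 0.0


def value_iteration(graph: Dict[str, List[str]], goal: str = "S",
                    gamma: float = 0.9, n_sweeps: int = 50) -> Dict[str, float]:
    """Plan over the known map: repeatedly back up the best one-step return."""
    values = {node: 0.0 for node in graph}
    for _ in range(n_sweeps):
        for node, neighbours in graph.items():
            if node == goal:
                continue  # the shelter is terminal and keeps value 0
            values[node] = max(
                step_reward(nxt, goal) + gamma * values[nxt] for nxt in neighbours
            )
    return values


def escape_route(graph: Dict[str, List[str]], values: Dict[str, float],
                 start: str = "T", goal: str = "S", gamma: float = 0.9) -> List[str]:
    """Follow the greedy policy implied by the planned values until the shelter."""
    route = [start]
    while route[-1] != goal:
        here = route[-1]
        best = max(
            graph[here],
            key=lambda nxt: step_reward(nxt, goal) + gamma * values[nxt],
        )
        route.append(best)
    return route


values = value_iteration(ARENA)
route = escape_route(ARENA, values)
print(route)
```

Because discounting makes each additional step reduce the value of a path, the greedy route from the threat platform runs through the shorter left arm. A model-free agent, by contrast, would need rewarded experience of each arm before its values separated.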