We present a numerical study of training a self-propelling agent to migrate in an unsteady flow environment. The agent is controlled with a reinforcement learning algorithm so that it exploits the background flow structure to minimize its energy consumption. We consider the agent migrating in two types of flow: a simple periodic double-gyre flow, which serves as a proof-of-concept example, and complex turbulent Rayleigh–Bénard convection, which serves as a paradigm for migration in the convective atmosphere or ocean. The results show that in both flows the smart agent learns to migrate from one position to another while exploiting background flow currents as much as possible to minimize energy consumption; this is evident when the smart agent is compared with a naive agent that moves straight from the origin to the destination. In addition, we find that, compared to the double-gyre flow, the flow field in turbulent Rayleigh–Bénard convection exhibits stronger fluctuations, and the agent is more likely to explore different migration strategies during training; the training process is therefore harder to converge. Nevertheless, we can still identify an energy-efficient trajectory that corresponds to the strategy with the highest reward received by the agent. These results have important implications for many migration problems, such as unmanned aerial vehicles flying in a turbulent convective environment, where planning energy-efficient trajectories is often involved.
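To make the setup concrete, the sketch below implements the standard time-periodic double-gyre model commonly used as such a proof-of-concept flow, together with a crude energy proxy for the naive straight-line agent against which the smart agent is compared. The abstract does not specify the flow equations, the parameter values (`A`, `eps`, `omega`), the domain, or the energy model, so all of these are illustrative assumptions rather than the authors' actual configuration.

```python
import numpy as np

# Standard time-periodic double-gyre model on the domain [0, 2] x [0, 1]:
#   psi(x, y, t) = A * sin(pi * f(x, t)) * sin(pi * y),
#   f(x, t) = eps * sin(omega * t) * x**2 + (1 - 2 * eps * sin(omega * t)) * x.
# Parameter values below are illustrative assumptions, not from the paper.
A = 0.1                    # velocity amplitude (assumed)
eps = 0.25                 # gyre-oscillation strength (assumed)
omega = 2 * np.pi / 10.0   # forcing frequency (assumed period of 10)

def velocity(x, y, t):
    """Return the (u, v) velocity of the double-gyre flow at (x, y, t)."""
    a = eps * np.sin(omega * t)
    b = 1.0 - 2.0 * eps * np.sin(omega * t)
    f = a * x**2 + b * x
    dfdx = 2.0 * a * x + b
    u = -np.pi * A * np.sin(np.pi * f) * np.cos(np.pi * y)        # u = -dpsi/dy
    v = np.pi * A * np.cos(np.pi * f) * np.sin(np.pi * y) * dfdx  # v =  dpsi/dx
    return u, v

# Crude energy proxy for a naive agent crossing the domain in a straight
# line at constant speed: integrate |swim velocity|^2 over time, where the
# swim (propulsion) velocity is the desired velocity minus the local flow.
# This quadratic energy model is an assumption, not from the abstract.
def naive_energy(start, goal, speed=0.2, dt=0.01):
    pos = np.array(start, dtype=float)
    goal = np.array(goal, dtype=float)
    t, energy = 0.0, 0.0
    while np.linalg.norm(goal - pos) > speed * dt:
        desired = speed * (goal - pos) / np.linalg.norm(goal - pos)
        swim = desired - np.array(velocity(pos[0], pos[1], t))
        energy += np.dot(swim, swim) * dt
        pos, t = pos + desired * dt, t + dt
    return energy

# Example: sample the flow and estimate the straight-line migration cost.
print(velocity(1.0, 0.5, 2.5))
print(naive_energy((0.2, 0.2), (1.8, 0.8)))
```

Under a cost of this form, an energy-saving strategy of the kind described above would detour to ride gyre currents that point toward the destination instead of swimming against them along the straight path.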