Quantum many-body control is a central milestone en route to harnessing quantum technologies. However, the exponential growth of the Hilbert space dimension with the number of qubits makes it challenging to classically simulate quantum many-body systems and, consequently, to devise reliable and robust optimal control protocols. Here, we present a novel framework for efficiently controlling quantum many-body systems based on reinforcement learning (RL). We tackle the quantum control problem by leveraging matrix product states (i) to represent the many-body state and (ii) as part of the trainable machine learning architecture for our RL agent. The framework is applied to prepare ground states of the quantum Ising chain, including critical states. It allows us to control systems far larger than neural-network-only architectures permit, while retaining the advantages of deep learning algorithms, such as generalizability and trainable robustness to noise. In particular, we demonstrate that RL agents are capable of finding universal controls, of learning how to optimally steer previously unseen many-body states, and of adapting control protocols on the fly when the quantum dynamics is subject to stochastic perturbations.
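To make the dual role of matrix product states concrete, the following is a minimal illustrative sketch, not the implementation used in this work: it shows how an MPS can both encode the chain's many-body state and act as a trainable component of an RL policy, with features obtained from overlaps between the state MPS and a set of trainable "filter" MPSs. All names (random_mps, mps_overlap, MPSPolicy), hyperparameters, and the overlap-feature construction are hypothetical choices made for illustration.

import numpy as np

def random_mps(n_sites, d=2, chi=4, rng=None):
    """Return a random MPS as a list of site tensors with shape (chi_left, d, chi_right)."""
    rng = rng or np.random.default_rng(0)
    tensors = []
    for i in range(n_sites):
        chi_l = 1 if i == 0 else chi
        chi_r = 1 if i == n_sites - 1 else chi
        tensors.append(rng.normal(size=(chi_l, d, chi_r)) / np.sqrt(d * chi))
    return tensors

def mps_overlap(bra, ket):
    """Contract <bra|ket> site by site using the standard transfer-matrix sweep."""
    env = np.ones((1, 1))
    for B, K in zip(bra, ket):
        # env[a,b] * conj(B)[a,s,c] * K[b,s,d] -> env[c,d]
        env = np.einsum('ab,asc,bsd->cd', env, B.conj(), K)
    return env[0, 0]

class MPSPolicy:
    """Toy policy: overlaps with trainable filter MPSs feed a linear map to action logits."""
    def __init__(self, n_sites, n_filters=8, n_actions=4, chi=4, seed=1):
        rng = np.random.default_rng(seed)
        self.filters = [random_mps(n_sites, chi=chi, rng=rng) for _ in range(n_filters)]
        self.W = rng.normal(size=(n_actions, n_filters)) * 0.1  # trainable readout

    def act(self, state_mps, rng=None):
        rng = rng or np.random.default_rng()
        feats = np.array([mps_overlap(f, state_mps) for f in self.filters])
        z = self.W @ feats
        p = np.exp(z - z.max())
        p /= p.sum()
        return rng.choice(len(p), p=p)  # sample a control action

if __name__ == "__main__":
    n = 20  # chain length; the MPS cost scales polynomially rather than exponentially in n
    state = random_mps(n)
    policy = MPSPolicy(n)
    print("sampled action:", policy.act(state))

In such a scheme the filter tensors and the readout weights would be updated by a standard policy-gradient rule, so the memory and compute cost of both the state representation and the agent scale with the bond dimension rather than with the full Hilbert space dimension.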