The levelised cost of energy of wave energy converters (WECs) is not yet competitive with that of fossil fuel-powered stations. To improve the feasibility of wave energy, it is necessary to develop effective control strategies that maximise energy absorption in mild sea states, whilst limiting motions in high waves. Due to their model-based nature, state-of-the-art control schemes struggle to deal with model uncertainties, to adapt to changes in the system dynamics over time, and to provide real-time centralised control for large arrays of WECs. Here, an alternative solution is introduced to address these challenges, applying deep reinforcement learning (DRL) to the control of WECs for the first time. A DRL agent is initialised from data collected in multiple sea states under linear model predictive control in a linear simulation environment. The agent outperforms model predictive control for high wave heights and periods, but underperforms close to the resonant period of the WEC. DRL also has a much lower computational cost at deployment time, because the computational effort is shifted from deployment to training. This provides confidence in the application of DRL to large arrays of WECs, enabling economies of scale. Additionally, model-free reinforcement learning can autonomously adapt to changes in the system dynamics, enabling fault-tolerant control.