Full Waveform Inversion (FWI) is a procedure for determining the elastic parameters of the Earth by reducing the misfit between observed elastodynamic wavefields and their numerically modeled counterparts. The numerical solution of the elastodynamic wave equation is computationally expensive, and its performance is typically memory-bandwidth bound. Computing the gradient of the FWI misfit functional adds further complexity, as it involves the zero-lag cross-correlation of two wavefields propagating in opposite temporal directions. In this paper, we use graphics processing units (GPUs) for their high memory bandwidth and combine two principal optimizations to compute FWI gradients on large models and for long simulation times. Wavefield reconstruction methods allow efficient gradient computations with minimal memory requirements and interconnect transfers. Time-space tiling techniques allow us to go beyond the limited GPU memory capacity while avoiding dramatic slowdowns due to the low interconnect bandwidth. The implementation uses a task-oriented, hybrid combination of explicitly managed and Unified Memory to satisfy these requirements. Benchmarks demonstrate that the proposed approach preserves 78–90% of the original performance when oversubscribing the physical memory available on GPUs. A comparison with existing methods highlights the benefits of the approach.