Full-waveform inversion (FWI) can be formulated as a nonlinear least-squares optimization problem. This nonconvex problem can be computationally expensive because it requires repeated solutions of the wave equation. Randomized subsampling techniques allow us to work with small subsets of (monochromatic) source experiments, reducing the computational cost. However, this subsampling may weaken subsurface illumination or introduce incoherent artifacts. These artifacts, together with the desire to obtain high-fidelity inversion results, motivate us to develop a technique to regularize this inversion problem. Following earlier work, we have taken advantage of the fact that curvelets represent subsurface models and model perturbations parsimoniously. At first, promoting sparsity on the model directly seemed the most natural way to proceed, but we have determined that in certain cases it can be advantageous to promote sparsity on the Gauss-Newton updates instead. Although constraining the one norm of the descent directions did not change the underlying FWI objective, the constrained model updates remained descent directions, removed subsampling-related artifacts, and improved the overall inversion result. We have empirically observed this phenomenon in situations where the different model updates occurred at roughly the same locations in the curvelet domain. We have further investigated and analyzed this behavior, in which nonlinear inversions benefit from sparsity-promoting constraints on the updates, by means of a set of carefully selected examples including the phase retrieval problem and time-harmonic FWI. In all cases, we have observed a faster decay of the residual and model error as a function of the number of iterations.
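
The statement that one-norm-constrained Gauss-Newton updates remain descent directions can be made concrete with a short sketch. The notation here is assumed for illustration and is not taken from the text: m is the model, d the observed data, F the forward modeling operator with Jacobian J at m, r = d - F(m) the data residual, C a curvelet transform whose adjoint C^H acts as the synthesis operator, and tau a one-norm budget.

```latex
% Illustrative notation only; all symbols are assumptions stated in the lead-in.
\begin{align}
  f(\mathbf{m}) &= \tfrac{1}{2}\,\|\mathbf{d} - F(\mathbf{m})\|_2^2,
  \qquad
  \mathbf{g} = \nabla f(\mathbf{m}) = -\mathbf{J}^{H}\mathbf{r}, \\
  \delta\mathbf{x} &= \operatorname*{arg\,min}_{\mathbf{x}}\;
  \tfrac{1}{2}\,\|\mathbf{r} - \mathbf{J}\mathbf{C}^{H}\mathbf{x}\|_2^2
  \quad \text{subject to} \quad \|\mathbf{x}\|_1 \le \tau,
  \qquad
  \delta\mathbf{m} = \mathbf{C}^{H}\delta\mathbf{x}.
\end{align}
% x = 0 is feasible, so the minimizer cannot do worse than x = 0:
\begin{align}
  \|\mathbf{r} - \mathbf{J}\,\delta\mathbf{m}\|_2^2 \le \|\mathbf{r}\|_2^2
  \;\Longrightarrow\;
  \operatorname{Re}\langle \mathbf{r}, \mathbf{J}\,\delta\mathbf{m}\rangle
  \ge \tfrac{1}{2}\,\|\mathbf{J}\,\delta\mathbf{m}\|_2^2
  \;\Longrightarrow\;
  \operatorname{Re}\langle \mathbf{g}, \delta\mathbf{m}\rangle
  \le -\tfrac{1}{2}\,\|\mathbf{J}\,\delta\mathbf{m}\|_2^2 \le 0.
\end{align}
```

Under these assumptions, the constrained update is a descent direction for f whenever J δm is nonzero, while f itself is unchanged: the constraint enters only through the subproblem that produces the update.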