Conventional methods solve full‐waveform inversion with gradient‐based algorithms that minimize an error function, which commonly measures the Euclidean distance between observed and predicted waveforms. This deterministic approach provides only a ‘best‐fitting’ model and cannot account for the uncertainties affecting the predicted solution. Local methods are also prone to getting trapped in local minima of the error function. On the other hand, casting this inverse problem into a probabilistic framework means dealing with the formidable computational cost of the Bayesian approach when it is applied to non‐linear problems with expensive forward evaluations and large model spaces. We present a gradient‐based Markov Chain Monte Carlo full‐waveform inversion in which the posterior sampling is accelerated by compressing the data and model spaces through the discrete cosine transform and by defining a proposal that is a local Gaussian approximation of the target posterior probability density. This proposal is constructed from the local Hessian and gradient of the log posterior, whose computation is made manageable by the compression of the data and model spaces. We demonstrate the applicability of the approach with two synthetic inversion tests on portions of the Marmousi and BP acoustic models. In these examples, the forward modelling is performed using Devito, a finite‐difference domain‐specific language that solves the discretized wave equation on a Cartesian grid. For both examples, the results of the implemented method are validated against those obtained with a classic deterministic approach. Our tests illustrate the efficiency of the proposed probabilistic method, which appears robust against cycle‐skipping and has a computational cost comparable to that of the local inversion.
The outcomes of the proposed probabilistic inversion can also serve as starting models for a subsequent local inversion step aimed at improving the spatial resolution of the probabilistic result, which is limited by the model compression.
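
To make the two acceleration ingredients concrete, the following is a minimal Python sketch, not the implementation used in this work. It assumes a 2‐D model array, an orthonormal type‐II discrete cosine transform in which only the low‐order coefficients are retained, and a Newton‐type Gaussian proposal with mean m − H⁻¹g and covariance H⁻¹, where g and H denote the gradient and Hessian of the negative log posterior in the compressed space. The function names `compress`, `decompress`, and `gaussian_proposal` are hypothetical, and the exact form of the proposal adopted in the paper may differ.

```python
import numpy as np
from scipy.fft import dctn, idctn


def compress(model, kz, kx):
    """Keep the kz-by-kx low-order DCT coefficients of a 2-D model."""
    coeffs = dctn(model, norm="ortho")  # orthonormal type-II DCT
    return coeffs[:kz, :kx].copy()


def decompress(coeffs, shape):
    """Zero-pad the retained coefficients and apply the inverse DCT."""
    full = np.zeros(shape)
    kz, kx = coeffs.shape
    full[:kz, :kx] = coeffs
    return idctn(full, norm="ortho")


def gaussian_proposal(m, grad, hess, rng):
    """Draw one candidate from N(m - H^{-1} g, H^{-1}).

    Assumption: grad and hess are the gradient and a positive-definite
    Hessian of the NEGATIVE log posterior, evaluated at the current
    compressed model m (a 1-D vector of DCT coefficients).
    """
    cov = np.linalg.inv(hess)
    cov = 0.5 * (cov + cov.T)  # enforce symmetry against round-off
    mean = m - cov @ grad
    return rng.multivariate_normal(mean, cov)
```

In a sampler built on this sketch, the retained coefficients would be flattened into the vector `m`, a candidate would be drawn with `gaussian_proposal`, and `decompress` would map it back to the full grid before the forward modelling. A Metropolis‐Hastings acceptance step, omitted here, would still be needed because the proposal depends on the current state.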