The performance of full waveform inversion (FWI) in constructing high-resolution subsurface models depends closely on the design of the misfit function. The least-squares norm (L2) is commonly used, but it is prone to local minima when a high-quality initial model and low-frequency data are unavailable. The Wasserstein-1 metric (W1) captures time shifts more effectively, but it may recover deep structures imprecisely. The Fourier metric compares the power spectra of modeled and observed data, offering higher-resolution updates near the solution. In this paper, we propose a progressive waveform inversion method, called FWI-WF, that combines the W1 and Fourier metrics. Specifically, in the early stage of inversion, we give greater weight to the W1 metric to build a good background model and avoid local minima. The Fourier metric then gradually dominates to refine edges and deep structures, yielding high-resolution inversion results. During optimization, we employ automatic differentiation to improve inversion efficiency. Experimental results on three baseline geologic models indicate that FWI-WF outperforms three state-of-the-art methods.
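The progressive weighting described above can be illustrated with a minimal sketch for a single pair of traces. This is not the paper's implementation: the function names, the linear weight schedule, the squaring used to turn traces into densities for W1, and the squared-difference form of the Fourier metric are all assumptions made for illustration, and the sketch uses plain NumPy rather than automatic differentiation.

```python
import numpy as np


def w1_misfit(pred, obs, dt=1.0, eps=1e-12):
    """1-D Wasserstein-1 distance between two traces.

    Assumption: traces are squared and normalized so that they behave
    like probability densities (one common choice among several).
    """
    p = pred ** 2
    q = obs ** 2
    p = p / (p.sum() * dt + eps)
    q = q / (q.sum() * dt + eps)
    # In 1-D, W1 equals the L1 distance between the cumulative distributions.
    cdf_diff = np.cumsum(p - q) * dt
    return np.sum(np.abs(cdf_diff)) * dt


def fourier_misfit(pred, obs):
    """Squared difference between the power spectra of two traces (assumed form)."""
    ps_pred = np.abs(np.fft.rfft(pred)) ** 2
    ps_obs = np.abs(np.fft.rfft(obs)) ** 2
    return np.sum((ps_pred - ps_obs) ** 2)


def w1_weight(it, n_iters):
    """Hypothetical linear schedule: 1 at the first iteration, 0 at the last."""
    return 1.0 - it / max(n_iters - 1, 1)


def progressive_misfit(pred, obs, it, n_iters, dt=1.0):
    """Weighted sum that shifts from W1 (early) to the Fourier metric (late)."""
    lam = w1_weight(it, n_iters)
    return lam * w1_misfit(pred, obs, dt) + (1.0 - lam) * fourier_misfit(pred, obs)
```

With `dt=1`, two copies of the same pulse offset by 7 samples give a `w1_misfit` of approximately 7, which shows why W1 is a convenient measure of time shifts early in the inversion. In practice the two terms have very different magnitudes, so some relative scaling would likely be needed before combining them.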