Shared gradients are widely used to protect the private information of training data in distributed machine learning systems. However, research on Deep Leakage from Gradients (DLG) has shown that private training data can be recovered from shared gradients. The DLG method still suffers from several issues, such as the "exploding gradient" phenomenon, a low attack success rate, and low fidelity of the recovered data. In this study, a Wasserstein DLG method, named WDLG, is proposed. Theoretical analysis shows that, provided the output layer of the model has a bias term, the label of the data can be predicted from whether the gradient of that bias is negative; because this prediction is independent of how well the virtual gradient approximates the shared gradient, the label can be recovered with 100% accuracy. In the proposed method, the Wasserstein distance is used to measure the error between the shared gradient and the virtual gradient, which improves the stability of model training, eliminates the "exploding gradient" phenomenon, and improves the fidelity of the recovered data. Moreover, a large-learning-rate strategy is designed to further accelerate the convergence of model training. Finally, the WDLG method is validated on the MNIST, Fashion MNIST, SVHN, CIFAR-100, and LFW datasets. Experimental results show that the proposed WDLG method provides more stable updates of the virtual data, a higher attack success rate, faster model convergence, higher fidelity of the recovered images, and support for designing large-learning-rate strategies.
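
To make the two core ideas concrete, the following is a minimal PyTorch sketch, not the authors' implementation. The toy network, the learning rate, and the use of the 1-D Wasserstein-1 distance between sorted, flattened gradients are our illustrative assumptions (the abstract does not pin down the exact Wasserstein formulation); the bias-gradient label inference follows the stated analysis for a softmax cross-entropy output layer that has a bias term.

```python
# Minimal sketch of (a) analytic label recovery from the output-layer bias
# gradient and (b) Wasserstein-based gradient matching. Illustrative only;
# names, model, and hyperparameters are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy model whose output layer has a bias term (required by the claim).
net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.Sigmoid(),
                    nn.Linear(256, 10, bias=True))

# "Victim" side: compute the gradient that would be shared.
x_true = torch.rand(1, 1, 28, 28)
y_true = torch.tensor([3])
loss = F.cross_entropy(net(x_true), y_true)
shared_grads = torch.autograd.grad(loss, net.parameters())

def infer_label(shared_grads):
    """For softmax + cross-entropy with a single sample, d(loss)/d(bias_i)
    = p_i - y_i, so the output-layer bias gradient is negative only at the
    true class. This is analytic, independent of the gradient matching."""
    bias_grad = shared_grads[-1]       # assumes last parameter = output bias
    return bias_grad.argmin().view(1)  # the sole negative entry is the minimum

def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance between two flattened tensors: mean
    absolute difference of sorted values. A simple stand-in for the
    paper's Wasserstein loss, which the abstract does not specify."""
    return (torch.sort(a.flatten()).values
            - torch.sort(b.flatten()).values).abs().mean()

# Attacker side: recover the label analytically, then optimize virtual data.
y_hat = infer_label(shared_grads)          # exact by construction: tensor([3])
x_dummy = torch.rand_like(x_true, requires_grad=True)
opt = torch.optim.Adam([x_dummy], lr=0.1)  # a comparatively large rate; the
                                           # paper's schedule is not given here

for step in range(300):
    opt.zero_grad()
    dummy_loss = F.cross_entropy(net(x_dummy), y_hat)
    dummy_grads = torch.autograd.grad(dummy_loss, net.parameters(),
                                      create_graph=True)
    # Match the virtual gradient to the shared gradient, layer by layer.
    d = sum(wasserstein_1d(dg, sg) for dg, sg in zip(dummy_grads, shared_grads))
    d.backward()
    opt.step()

print("recovered label:", y_hat.item(), "| final gradient distance:", d.item())
```

Because the label is read off the sign pattern of the shared bias gradient alone, it is exact regardless of how far `x_dummy` has converged, which is the independence property claimed above; only the image recovery depends on the Wasserstein gradient-matching loop.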