The widely used wavelet-thresholding techniques (DWT-H and DWT-S) exhibit near-optimal behavior that no local denoising filter can improve on, but they cannot exploit the similarity of small image patches to enhance denoising performance. Two recent improvements (WNLM and NLMW) used the Euclidean distance to measure the similarity of image patches and then applied non-local means over similar patches for further denoising. Because the Euclidean distance is a poor measure of patch similarity, the gains from these two methods are limited. In this study, we introduced the earth mover’s distance (EMD) as the similarity measure for small-scale patches within the wavelet sub-bands of noisy images. Moreover, at higher noise levels, we further incorporated joint bilateral filtering, which filters in both the spatial domain and the intensity domain of an image. Denoising simulation experiments on BSDS500 demonstrated that our algorithm outperformed the DWT-H, DWT-S, WNLM, and NLMW algorithms by 4.197 dB, 3.326 dB, 2.097 dB, and 1.162 dB, respectively, in terms of the average PSNR, and by 0.230, 0.213, 0.132, and 0.085, respectively, in terms of the average SSIM.
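
To make the contrast between the two similarity measures concrete, the following minimal Python sketch compares the Euclidean distance with the EMD on toy one-dimensional histograms. It is an illustration under stated assumptions, not the method of this study: the abstract does not specify how patch signatures are built, so the sketch assumes each patch is summarized by a histogram with equal total mass and unit bin spacing, in which case the EMD reduces to the sum of absolute differences of the cumulative sums. The function names `euclidean` and `emd_1d` are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean (L2) distance between two equal-length histograms."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def emd_1d(p, q):
    """EMD between two 1D histograms (assumption: equal mass, unit bin spacing).

    In this special case the EMD equals the sum of the absolute values of
    the running difference between the two histograms.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    if not math.isclose(sum(p), sum(q)):
        raise ValueError("histograms must have the same total mass")
    carried = 0.0  # mass that still has to be moved to the next bin
    work = 0.0     # accumulated (mass x distance) moved so far
    for a, b in zip(p, q):
        carried += a - b
        work += abs(carried)
    return work


# Toy histograms: all mass in a single bin, shifted by one or three bins.
ref = [1.0, 0.0, 0.0, 0.0]
near = [0.0, 1.0, 0.0, 0.0]
far = [0.0, 0.0, 0.0, 1.0]

# The Euclidean distance cannot tell the small shift from the large one.
print(euclidean(ref, near), euclidean(ref, far))
# The EMD grows with the size of the shift: 1.0 versus 3.0.
print(emd_1d(ref, near), emd_1d(ref, far))
```

In this toy case the Euclidean distance rates `near` and `far` as equally dissimilar to `ref`, whereas the EMD ranks `near` as closer, which is the kind of behavior that motivates replacing the Euclidean distance as a patch-similarity measure.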