This paper presents a novel computational optimization of the deceived non-local means filter using a moving average and symmetric weighting. The proposed optimization is compared with other approaches that reduce the computational cost of the deceived non-local means filter. Furthermore, the impact of parallelizing the different optimization approaches is assessed by evaluating their execution time and scalability on the Intel Xeon Phi Knights Landing (KNL) architecture. The sequential implementation of the proposed optimization achieved a 90x speedup, while its parallelized implementation yielded a speedup of up to 1662x.
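To make the two ingredients named above concrete, the following is a minimal sketch of how a moving average and symmetric weighting can reduce the cost of a plain non-local means filter on a 1-D signal. It is an illustration under stated assumptions, not the paper's implementation: it does not include the deceived variant's specifics, and the function names (`moving_sum`, `nlm_1d`), the parameters (`search_radius`, `patch_radius`, `h`), the Gaussian-style weight, and the truncated border handling are all hypothetical choices made for this example.

```python
import math


def moving_sum(values, radius):
    """Sliding-window sum of width 2*radius+1, truncated at the borders.

    A prefix (running) sum makes each window cost O(1) instead of
    O(radius), which is the moving-average idea.
    """
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    n = len(values)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        out.append(prefix[hi] - prefix[lo])
    return out


def nlm_1d(signal, search_radius=3, patch_radius=1, h=0.5):
    """Plain non-local means on a 1-D list (illustrative assumption only)."""
    n = len(signal)
    num = [float(s) for s in signal]  # each sample weights itself with 1
    den = [1.0] * n
    # Symmetric weighting: w(i, i+t) == w(i+t, i), so only positive
    # offsets t are visited and each weight is used for both samples.
    for t in range(1, search_radius + 1):
        # Pointwise squared differences between the signal and its shift.
        sq = [(signal[i] - signal[i + t]) ** 2 for i in range(n - t)]
        # Patch distances for every position at once via the moving sum.
        dist = moving_sum(sq, patch_radius)
        for i in range(n - t):
            w = math.exp(-dist[i] / (h * h))
            num[i] += w * signal[i + t]
            den[i] += w
            num[i + t] += w * signal[i]
            den[i + t] += w
    return [a / b for a, b in zip(num, den)]


if __name__ == "__main__":
    noisy = [0.0, 0.1, -0.1, 0.0, 1.0, 0.9, 1.1, 1.0]
    print(nlm_1d(noisy))
```

In this sketch the moving sum removes the inner loop over patch elements, and the symmetry halves the number of weights that are evaluated. Both savings are independent of how the outer loops are later distributed across threads.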