The long encoding time and high energy consumption of state-of-the-art Video Coding (VC) standards, such as HEVC, stem mainly from the large number of distortion calculations they perform. Among the most widely used distortion metrics is the Sum of Squared Differences (SSD), which correlates strongly with the Peak Signal-to-Noise Ratio (PSNR). Current encoders exploit this correlation to achieve a good trade-off between rate and distortion. Since VC is mandatory in current battery-powered devices, the adopted distortion metric must be as energy-efficient as possible. Although simple, the SSD requires a squaring operation, whose hardware realization is costly. Thus, some VC hardware designs replace the SSD with the Sum of Absolute Differences (SAD). However, using SAD instead of SSD comes at a cost in coding efficiency. In this work, we investigate four hardware designs for the squaring operation. Synthesis results for the designed architectures are compared with a reference SAD design from the literature. The best SSD architecture, which uses clock gating, requires only 20% more energy than SAD.
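As a software illustration of the two metrics compared above, the following minimal sketch computes SAD and SSD over a block of samples and derives PSNR from SSD through the mean squared error, which is the standard relation behind the SSD/PSNR correlation. The function names, the flat sample lists, and the 8-bit peak value of 255 are assumptions made for this example; it is not a model of the hardware architectures evaluated in this work.

```python
import math

def sad(orig, pred):
    """Sum of Absolute Differences: one subtraction and one absolute value per sample."""
    return sum(abs(a - b) for a, b in zip(orig, pred))

def ssd(orig, pred):
    """Sum of Squared Differences: one subtraction and one squaring per sample."""
    return sum((a - b) ** 2 for a, b in zip(orig, pred))

def psnr_from_ssd(ssd_value, num_samples, max_value=255):
    """PSNR in dB, using MSE = SSD / N. max_value=255 assumes 8-bit samples."""
    if ssd_value == 0:
        return math.inf  # identical blocks: no distortion
    mse = ssd_value / num_samples
    return 10 * math.log10(max_value ** 2 / mse)

# Hypothetical 4-sample block and its prediction (illustrative values only).
orig = [52, 55, 61, 66]
pred = [50, 55, 64, 60]

print(sad(orig, pred))   # 11
print(ssd(orig, pred))   # 49
print(psnr_from_ssd(ssd(orig, pred), len(orig)))
```

The only per-sample difference between the two metrics is the squaring in `ssd` versus the absolute value in `sad`, which is why the cost of the squarer dominates the comparison between SSD and SAD hardware.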