The continuous wavelet transform (CWT) has been used in radar-based vital signs detection to identify and remove motion artifacts from received radar signals. Because the CWT is computationally heavy, it typically entails long processing times and complex hardware implementations. In its standard form, the algorithm is usually run with software processing tools and cannot support high-performance data processing. The aim of this research is to design an optimized CWT architecture for implementation on a field-programmable gate array (FPGA) in order to identify unwanted movement introduced into the retrieved vital signs signals. The optimization approaches in the new implementation structure are based on frequency-domain processing, reducing the required number of operations, and parallel processing of independent operations. Our design achieves significant improvements in both processing speed and logic utilization. Processing the algorithm with the proposed hardware architecture is found to be 48 times faster than processing it in MATLAB. The design also achieves a 58% improvement in speed compared to alternative solutions reported in the literature. Moreover, efficient resource utilization is achieved and reported. This performance is due to deliberately combining multiple optimization techniques, which yields improvements along several dimensions. As a result, the design is suitable for high-performance data processing applications.

INDEX TERMS Continuous wavelet transform, FPGA implementation, radar remote sensing, motion artifact rejection, random body movements, FFT-based CWT, parallel processing.
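As an illustration of the frequency-domain approach named above, the following is a minimal software sketch of an FFT-based CWT. It is not the proposed FPGA architecture. The Morlet wavelet, its normalization, the center frequency `omega0 = 6`, and the names `morlet_hat` and `cwt_fft` are assumptions made for this example only. The sketch shows two properties that such optimizations can rely on: the FFT of the input is computed once and reused for every scale, and each scale is an independent multiply and inverse FFT that could run in parallel.

```python
import numpy as np


def morlet_hat(omega, scale, omega0=6.0):
    """Fourier transform of an analytic Morlet wavelet (assumed wavelet choice).

    Zero for non-positive frequencies, Gaussian centred at omega0 / scale.
    """
    gauss = np.exp(-0.5 * (scale * omega - omega0) ** 2)
    return (np.pi ** -0.25) * gauss * (omega > 0)


def cwt_fft(x, scales, dt=1.0, omega0=6.0):
    """CWT of a 1-D signal computed in the frequency domain.

    Returns a complex array of shape (len(scales), len(x)).
    """
    x = np.asarray(x, dtype=float)
    scales = np.asarray(scales, dtype=float)
    n = x.size

    # Angular frequency of every FFT bin.
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)

    # Forward FFT of the signal: computed once, shared by all scales.
    x_hat = np.fft.fft(x)

    # One row of wavelet spectra per scale, with energy normalization.
    norm = np.sqrt(2.0 * np.pi * scales[:, None] / dt)
    psi_hat = norm * morlet_hat(omega[None, :], scales[:, None], omega0)

    # Per-scale product and inverse FFT; rows are mutually independent,
    # so in hardware they could be evaluated in parallel.
    return np.fft.ifft(x_hat[None, :] * np.conj(psi_hat), axis=1)


if __name__ == "__main__":
    dt = 0.05                                  # 20 Hz sampling (illustrative)
    t = np.arange(1024) * dt
    x = np.sin(2.0 * np.pi * 1.25 * t)         # 1.25 Hz test tone
    scales = np.geomspace(0.2, 4.0, 64)
    power = np.abs(cwt_fft(x, scales, dt=dt)) ** 2
    print("scale of maximum mean power:", scales[np.argmax(power.mean(axis=1))])
```

Multiplying in the frequency domain replaces a separate time-domain convolution for each scale, which is the main source of operation savings in this formulation. Note that the convolution here is circular, so a practical implementation would typically pad the signal to limit edge effects.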