To enable the low-cost design of devices for the 5G IoT standard, reduced capability (NR-REDCAP), hardware-software trade-offs must be made for the various baseband signal processing kernels. Dedicated hardware for a kernel provides higher speed and power efficiency but limits the device's programmability. Given the wide range of user equipment (UE) deployment scenarios and dynamic wireless channel conditions, flexible solutions such as digital signal processors (DSPs) are favorable for implementing channel estimation, channel equalization, and waveform modulation/demodulation algorithms. Because algorithms such as decimation, synchronization, and decoding have stringent latency requirements, designers might favor dedicated hardware over DSP-based solutions for them. Such dedicated hardware increases the device cost, as it must be added to the modem design solely to implement these specific algorithms. In this work, we study the most critical operation mode of synchronization for the NR-REDCAP standard, i.e., synchronization during handover between cells. Whereas dedicated hardware might be the best implementation choice for decimation and synchronization in the enhanced mobile broadband (eMBB) 5G NR standard, for NR-REDCAP a cost saving can be achieved by implementing and optimizing these kernels on a vector DSP. We propose an architecture-aware methodology for implementing the most compute-intensive sub-kernels on our vector DSP. Furthermore, we perform structural optimizations to identify the best-performing sub-kernel variant. After algorithmic and structural optimizations, our results show that the synchronization procedure can be accommodated on a vector DSP with a clock frequency of 500 MHz.