In this paper, we propose a software-based simulator and an optimized hardware implementation of the CORDIC algorithm. The simulator computes the relationship between the number of iterations and the bit width, and both parameters are selected from this relationship to satisfy the precision requirement. In the proposed hardware implementation, the addition and subtraction operations share hardware resources. The two's complement is calculated in two steps, inversion followed by adding one, and the add-one operation is integrated into the addition component of each iteration. In addition, a 3:2 compressor is used to reduce the number of adders, and a carry look-ahead adder is used to shorten the critical path. Synthesis results show that, compared with state-of-the-art designs, our method reduces the area by 53.79% and the delay by 58.97% while maintaining the same precision.
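
As an illustration of the two ideas summarized above, the following is a minimal fixed-point sketch in Python. It is not the simulator proposed in this paper. The names `add_sub`, `cordic_sin_cos`, and `max_error`, the rotation-mode formulation, and the word layout (one sign bit, two integer bits, `frac_bits` fractional bits) are assumptions made for the example. The 3:2 compressor and the carry look-ahead adder are not modeled. The sketch shows a shared add/subtract unit in which subtraction inverts the operand and supplies the "+1" as a carry-in, and a sweep that reports the worst-case error for a given iteration count and bit width.

```python
import math


def to_signed(v, width):
    """Interpret a width-bit word as a two's complement integer."""
    return v - (1 << width) if v & (1 << (width - 1)) else v


def add_sub(a, b, subtract, width):
    """Shared adder: returns a + b or a - b on width-bit words.

    Subtraction inverts b and feeds the '+1' of the two's complement
    into the same addition as a carry-in.
    """
    mask = (1 << width) - 1
    operand = (b ^ mask) if subtract else b
    carry_in = 1 if subtract else 0
    return (a + operand + carry_in) & mask


def asr(v, shift, width):
    """Arithmetic shift right of a width-bit two's complement word."""
    return (to_signed(v, width) >> shift) & ((1 << width) - 1)


def cordic_sin_cos(theta, iterations, frac_bits):
    """Rotation-mode CORDIC; returns (cos, sin) for |theta| <= pi/2."""
    width = frac_bits + 3  # assumed layout: sign + 2 integer bits
    mask = (1 << width) - 1
    scale = 1 << frac_bits
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x = round(scale / gain) & mask  # pre-compensate the CORDIC gain
    y = 0
    z = round(theta * scale) & mask
    for i in range(iterations):
        neg = to_signed(z, width) < 0  # rotation direction d = -1 if neg
        atan_i = round(math.atan(2.0 ** -i) * scale)
        dx = asr(y, i, width)
        dy = asr(x, i, width)
        x_next = add_sub(x, dx, not neg, width)  # x - d * (y >> i)
        y_next = add_sub(y, dy, neg, width)      # y + d * (x >> i)
        z = add_sub(z, atan_i, not neg, width)   # z - d * atan(2^-i)
        x, y = x_next, y_next
    return to_signed(x, width) / scale, to_signed(y, width) / scale


def max_error(iterations, frac_bits, samples=101):
    """Worst-case |error| of cos and sin over [-pi/2, pi/2]."""
    worst = 0.0
    for k in range(samples):
        theta = -math.pi / 2 + math.pi * k / (samples - 1)
        c, s = cordic_sin_cos(theta, iterations, frac_bits)
        worst = max(worst, abs(c - math.cos(theta)), abs(s - math.sin(theta)))
    return worst
```

Calling `max_error` over a grid of iteration counts and fractional bit widths gives a table from which the smallest pair meeting a target precision can be read off, which is the kind of relationship the simulator is described as providing.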