This work presents a method for increasing the throughput and energy efficiency of finite impulse response (FIR) filters through the combined use of retiming and two-level pipelining. The challenge is to raise throughput and energy efficiency while also reducing latency and hardware complexity. Two-level pipelining is used to divide the addition and multiplication operations, and the divided addition operation is then retimed. Pipelined Retiming delay generation Filter (PRF) architectures were designed for m-tap filters (4-, 8-, 16-, 32-, and 64-tap) with n-bit input word lengths (4, 8, 16, and 32 bits). With pipelining, the proposed distributed arithmetic-based FIR filter has a minimum delay of 2.564 ns (4-tap, 8-bit input) and a maximum delay of 56.040 ns (64-tap, 32-bit word length). With retiming, the minimum delay is 0.687 ns (4-tap, 8-bit input) and the maximum delay is 4.535 ns (64-tap, 32-bit word length). Compared with the pipelining method, retiming reduces the delay by 73.20% for the 4-tap, 8-bit filter and by 91.90% for the 64-tap, 32-bit filter.
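The reported reductions can be checked directly from the four delay figures. The following is a minimal sketch and not part of the proposed design: the function name `delay_reduction_percent` and the dictionary layout are illustrative assumptions, and only the delay values come from the text.

```python
def delay_reduction_percent(baseline_ns: float, improved_ns: float) -> float:
    """Relative delay reduction, in percent, of improved_ns versus baseline_ns."""
    return (baseline_ns - improved_ns) / baseline_ns * 100.0


# (pipelining delay, retiming delay) in nanoseconds, as reported in the text.
delays_ns = {
    "4-tap, 8-bit": (2.564, 0.687),
    "64-tap, 32-bit": (56.040, 4.535),
}

for config, (pipelined, retimed) in delays_ns.items():
    reduction = delay_reduction_percent(pipelined, retimed)
    print(f"{config}: {reduction:.2f}% lower delay with retiming")
```

The computed values agree with the stated 73.20% and 91.90% to within rounding in the second decimal place.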