This paper proposes a novel scheme for the FPGA implementation of a wavepipelined filter using the Distributed Arithmetic Algorithm (DAA). To make the circuit independent of fabrication-induced parameter variations, a sub-optimal wavepipelining scheme is proposed for the combinational blocks of the DA filter. A built-in self-tuning FSM chooses the clock skew and clock period for the I/O registers between the wavepipelined blocks. To test the efficacy of the proposed scheme, three filters with 4, 8 and 10 taps are implemented using the DAA approach on a Xilinx Spartan-II XC2S100-5PQ208 device. Each filter is implemented using three schemes: synchronous pipelining, sub-optimal wavepipelining, and no pipelining (i.e., neither synchronous pipelining nor wavepipelining). The implementation results show that the wavepipelined DA filters are faster by a factor of 1.31-1.61 than the non-pipelined DA filters. The synchronous pipelined DA filters are in turn faster by a factor of 1.73-2.06 than the wavepipelined DA filters. The increased speeds are achieved at the cost of increasing the number of slices by 25-33%, the number of registers by 350-530% and the power dissipation by 107-167%. The delay-register product of the wavepipelined DA filters is lower by a factor of 2.64-3.06 than that of the pipelined DA filters. The proposed technique is also applicable to ASICs and to FPGAs from other vendors.
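
For readers unfamiliar with distributed arithmetic, the following minimal software sketch illustrates the general idea behind a DA filter: the multiply-accumulate of an FIR filter is replaced by a look-up table of coefficient partial sums that is addressed one bit-plane of the input samples at a time. It is a behavioural model only and does not represent the hardware architecture, the wavepipelining, or the self-tuning FSM proposed in this paper. The function names, the integer coefficients and the 8-bit two's-complement sample width are assumptions made for illustration.

```python
def build_lut(coeffs):
    """Precompute every partial sum of the coefficients.

    Entry `addr` holds the sum of coeffs[i] for each bit i set in `addr`.
    """
    taps = len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << taps)
    ]


def da_inner_product(coeffs, samples, bits=8):
    """Inner product of coeffs and signed `bits`-wide samples, bit-serially."""
    lut = build_lut(coeffs)
    mask = (1 << bits) - 1  # view each sample as `bits`-wide two's complement
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into a single LUT address.
        addr = 0
        for i, x in enumerate(samples):
            addr |= (((x & mask) >> b) & 1) << i
        term = lut[addr] << b
        # The sign bit carries negative weight in two's complement.
        acc += -term if b == bits - 1 else term
    return acc


def da_fir(coeffs, signal, bits=8):
    """Run a DA-based FIR filter over `signal`, starting from a zero state."""
    window = [0] * len(coeffs)
    out = []
    for x in signal:
        window = [x] + window[:-1]  # shift the newest sample into the delay line
        out.append(da_inner_product(coeffs, window, bits))
    return out
```

In this model the look-up table has one entry per combination of taps (2^K entries for K taps) and one table access is made per input bit, which is why practical DA designs usually partition long filters into several smaller tables.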