Interleaving is frequently used in digital communication and storage systems to improve the performance of forward error-correcting codes. For turbo codes, the interleaver is an integral component, and its proper design is crucial for good performance. The quadratic permutation polynomial (QPP) interleaver is a contention-free interleaver that is suitable for parallel turbo decoder implementation. This paper proposes a novel interleaver design for turbo codes, a variant of the QPP interleaver, which permutes a sequence of bits with the same statistical distribution as a conventional QPP interleaver and performs as well as or better than the conventional QPP interleaver. The proposed architecture has been simulated and synthesized using Xilinx and HDL Designer tools. A very-large-scale integration (VLSI) architecture for the proposed interleaver is presented and analyzed for the trade-off among area, delay and power dissipation. Thermal power dissipation and device utilization have been computed for the proposed design using the Quartus II (32-bit) tool. The paper also presents a comparison between the proposed variant of the QPP interleaver and the conventional QPP interleaver.
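For readers unfamiliar with the conventional QPP interleaver, the following minimal Python sketch may help. It illustrates only the conventional mapping pi(i) = (f1*i + f2*i^2) mod K and a check of the contention-free property mentioned above; it does not reproduce the variant proposed in this paper. The function names are hypothetical, and the parameters K = 40, f1 = 3, f2 = 10 are taken from the LTE turbo code interleaver table as an illustrative assumption, not from this paper.

```python
def qpp_indices(K, f1, f2):
    """Return the conventional QPP permutation pi(i) = (f1*i + f2*i^2) mod K."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


def interleave(bits, perm):
    """Output position i takes the input symbol at position perm[i]."""
    return [bits[p] for p in perm]


def deinterleave(bits, perm):
    """Invert interleave(): put the symbol at position i back at perm[i]."""
    out = [None] * len(perm)
    for i, p in enumerate(perm):
        out[p] = bits[i]
    return out


def is_contention_free(perm, M):
    """Check that M parallel windows never access the same memory bank.

    The block is split into M windows of size W = K / M. At each step i,
    the M addresses perm[i + t*W] must fall into M distinct banks,
    where the bank of an address is address // W.
    """
    K = len(perm)
    if K % M != 0:
        raise ValueError("M must divide the block length K")
    W = K // M
    for i in range(W):
        banks = {perm[i + t * W] // W for t in range(M)}
        if len(banks) != M:
            return False
    return True


# Illustrative parameters (assumed, from the LTE interleaver table).
perm = qpp_indices(40, 3, 10)
data = list(range(40))
restored = deinterleave(interleave(data, perm), perm)
```

With these assumed parameters, `perm` is a valid permutation of 0..39, `restored` equals `data`, and `is_contention_free(perm, 4)` holds, which is the property that lets four sub-decoders access separate memory banks in parallel.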