“…while [11], [11 optimized], [12], and [13] require 704, 504, 450, and 520 instructions, respectively, giving it a performance gain of more than four times over existing architectures. The main contributions to this performance gain are the Butterfly and 128-bit Add-Sub instructions, which constitute nearly 40% of the Fast IDCT computation.…”
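
To make the role of a combined butterfly/add-sub operation concrete, the following is a minimal sketch of what such an operation might compute. The excerpt does not define the instruction semantics, so everything here is an assumption for illustration: the 128-bit word is modeled as eight signed 16-bit lanes, overflow wraps in two's complement, and the names `butterfly` and `wrap16` are hypothetical. A butterfly in a fast IDCT takes two inputs and produces their sum and their difference; done lane by lane across a 128-bit word, a single operation stands in for eight scalar additions and eight scalar subtractions.

```python
from typing import List, Tuple

# Assumed layout (not stated in the excerpt): one 128-bit word
# holds eight signed 16-bit lanes.
LANES = 8
LANE_BITS = 16


def wrap16(x: int) -> int:
    """Wrap an integer to signed 16-bit two's complement."""
    x &= (1 << LANE_BITS) - 1
    return x - (1 << LANE_BITS) if x & (1 << (LANE_BITS - 1)) else x


def butterfly(a: List[int], b: List[int]) -> Tuple[List[int], List[int]]:
    """Hypothetical lane-wise add-sub: returns (a + b, a - b) per lane."""
    if len(a) != LANES or len(b) != LANES:
        raise ValueError(f"expected {LANES} lanes per operand")
    sums = [wrap16(x + y) for x, y in zip(a, b)]
    diffs = [wrap16(x - y) for x, y in zip(a, b)]
    return sums, diffs


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5, 6, 7, 8]
    b = [8, 7, 6, 5, 4, 3, 2, 1]
    sums, diffs = butterfly(a, b)
    print(sums)
    print(diffs)
```

In this model, each call replaces sixteen scalar arithmetic operations, which is one plausible reading of why such instructions would account for a large share of the reported gain.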