“…Several architectures have been proposed in the literature to accelerate DCT/IDCT (inverse DCT) computation for H.265/HEVC [6–20]. Some of them do not support all transform sizes [10–12, 14, 18, 20]. All of the designs embed (or assume [10, 17]) a full‐size transposition buffer (32 × 32 samples), implemented either as a register matrix [6–9, 11, 16, 18] or as memory modules [8, 13, 15, 19, 20].…”