“…Our proposed architecture supports variable block sizes (4×4, 8×8, 16×16, and 32×32), whereas references [18], [20], [21] each support only a single, smaller block size (8×8 or 16×16). Those designs also do not use on-chip DSP blocks. By using on-chip DSPs, our proposed design achieves a much shorter critical path and a significant improvement in frequency and throughput.…”