“…However, the cycle cost of serial multipliers is much higher than that of full-word multipliers. Thus, many designs [12,13,14,15,16,17,18,19] propose high-radix methods that scan multiple bits simultaneously to reduce the cycle cost. Furthermore, to avoid the time-consuming carry chain in partial-product accumulation, carry-save structures are adopted [14,18] to reduce latency.…”
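
The two ideas in the passage can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the architecture of any of the cited designs: the function names, the `width` and `k` parameters, and the use of Python integers to stand in for registers are all hypothetical. It scans `k` bits of the multiplier per cycle (radix 2^k) and keeps the running total in carry-save form, so each cycle needs only bitwise logic and the single carry-propagating addition is deferred to the end.

```python
def carry_save_add(s: int, c: int, x: int) -> tuple[int, int]:
    """3:2 compression: return (sum, carry) with sum + carry == s + c + x.

    Each bit position is computed independently, so no carry ripples
    across the word inside this step.
    """
    new_sum = s ^ c ^ x
    new_carry = ((s & c) | (s & x) | (c & x)) << 1
    return new_sum, new_carry


def high_radix_multiply(a: int, b: int, width: int, k: int) -> tuple[int, int]:
    """Multiply non-negative a by b, scanning k bits of b per cycle.

    Assumes 0 <= b < 2**width. Returns (product, cycles).
    """
    digit_mask = (1 << k) - 1
    s, c = 0, 0          # accumulator held in carry-save form
    cycles = 0
    for shift in range(0, width, k):
        digit = (b >> shift) & digit_mask
        # Hypothetical shortcut: hardware would typically select a
        # precomputed or recoded multiple of a rather than multiply here.
        partial_product = (a * digit) << shift
        s, c = carry_save_add(s, c, partial_product)
        cycles += 1
    # One carry-propagating addition resolves the final result.
    return s + c, cycles


if __name__ == "__main__":
    for k in (1, 2, 4):
        product, cycles = high_radix_multiply(0xBEEF, 0x1234, width=16, k=k)
        print(f"k={k}: cycles={cycles}, correct={product == 0xBEEF * 0x1234}")
```

In this model the cycle count falls from `width` at `k = 1` to roughly `width / k`, rounded up, which is the trade-off the passage describes: fewer cycles at the cost of handling a wider digit in each cycle.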