“…In the literature, subexpression elimination has been used [19], [20] for applications involving multiple constant multiplication operations, such as FIR filter implementations. Furthermore, distributed arithmetic [21] has also been proposed for optimized DSP implementations in which constants are involved [22], [23]. Beyond DSP applications, Navarro et al. [24] propose a way to accelerate arithmetic reduction for summations in a GPU core by exploiting distributed arithmetic.…”