“…The fine-grained techniques relax individual operations, mainly in hardware (at the register-transfer or transistor level), to reduce critical-path delay; examples include (segmented) computational resources (e.g., adders and multipliers) [8,21,27,28] and truncation of least significant bits (LSBs) [13,20]. These techniques are well suited to relatively small, simple systems such as DSP circuits [7,13,20]. In contrast, the coarse-grained techniques reduce the number of computations through task skipping [17,18], input sampling [22], pruning [25], and data reuse [6,14,15], and they are better suited to relatively large, complex systems such as multicore processors with multiple memory hierarchies [5,6,14,15,17,22,25].…”
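
The contrast between the two granularities can be illustrated with a minimal Python sketch. It is only a software model of the numerical effect: it assumes non-negative integer operands and does not capture the critical-path delay savings that motivate the hardware techniques. The function names (`truncate_lsbs`, `approx_add`, `sampled_mean`) are hypothetical and do not come from any of the cited works.

```python
from typing import Sequence


def truncate_lsbs(x: int, k: int) -> int:
    """Zero the k least significant bits of a non-negative integer."""
    return (x >> k) << k


def approx_add(a: int, b: int, k: int) -> int:
    """Fine-grained relaxation (illustrative): every addition is made
    cheaper by dropping the k LSBs of each operand before adding."""
    return truncate_lsbs(a, k) + truncate_lsbs(b, k)


def sampled_mean(values: Sequence[float], stride: int) -> float:
    """Coarse-grained relaxation (illustrative): input sampling.
    Only every stride-th input is processed, so the number of
    additions shrinks by roughly a factor of stride."""
    sample = values[::stride]
    return sum(sample) / len(sample)


if __name__ == "__main__":
    # Fine-grained: exact 13 + 7 = 20; with k = 2 the operands become 12 and 4.
    print(approx_add(13, 7, 2))
    # Coarse-grained: exact mean is 3.5; sampling every 2nd input uses [1, 3, 5].
    print(sampled_mean([1, 2, 3, 4, 5, 6], 2))
```

In this model the fine-grained variant still performs every operation but with reduced precision, so the error per addition is bounded (below 2^(k+1) here), while the coarse-grained variant keeps each operation exact but performs fewer of them, so its error depends on how representative the sampled inputs are.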