High-level synthesis of approximate hardware under joint precision and voltage scaling

Lee, Seogoo; John, Lizy K.; Gerstlauer, Andreas

doi:10.23919/date.2017.7926980

Cited by 61 publications

(37 citation statements)

References 14 publications


“…A multi-objective design space exploration technique is used to identify the best set of approximate variants. Recently, a new technique is proposed to raise the level of abstraction by synthesizing approximate circuit directly from C descriptions [6]. High-level synthesis in conjunction with approximations on the critical path can yield additional savings through voltage scaling [6,12].…”

Section: Previous Work

mentioning

confidence: 99%

“…The first line of research has devised custom approximate designs for typical arithmetic building blocks (e.g., adders, multipliers [2,3,4,15]). The second line has targeted approximation of more generic circuits either from gate-level (i.e., Boolean descriptions) [7,9,14,17,18], higher-level descriptions, such as RTL or behavioral descriptions [13], or even direct C to approximate hardware synthesis [6].…”

Section: Introduction

mentioning

confidence: 99%


Blasys

Hashemi; Tann; Reda

2018, Proceedings of the 55th Annual Design Automation Conference

Approximate computing is an emerging paradigm where design accuracy can be traded off for benefits in design metrics such as design area, power consumption or circuit complexity. In this work, we present a novel paradigm to synthesize approximate circuits using Boolean matrix factorization (BMF). In our methodology the truth table of a sub-circuit of the design is approximated using BMF to a controllable approximation degree, and the results of the factorization are used to synthesize a less complex subcircuit. To scale our technique to large circuits, we devise a circuit decomposition method and a subcircuit design-space exploration technique to identify the best order for subcircuit approximations. Our method leads to a smooth trade-off between accuracy and full circuit complexity as measured by design area and power consumption. Using an industrial strength design flow, we extensively evaluate our methodology on a number of testcases, where we demonstrate that the proposed methodology can achieve up to 63% in power savings, while introducing an average relative error of 5%. We also compare our work to previous works in Boolean circuit synthesis and demonstrate significant improvements in design metrics for same accuracy targets.

“…Existing AC techniques can be roughly classified into two types from the perspective of granularity: fine-grained techniques (at the operation-level) and coarse-grained techniques (at the task-level). The fine-grained techniques aim at relaxing every operation, mainly on hardware (the register-transfer or transistor levels) for the sake of critical path delay reduction, such as (segmented) computational resources (e.g., adder and multiplier) [8,21,27,28] and least significant bits (LSBs) truncation [13,20]. These techniques are well suited for relatively small, simple systems like DSP circuits [7,13,20].…”

Section: Introduction

mentioning

confidence: 99%

“…The fine-grained techniques aim at relaxing every operation, mainly on hardware (the register-transfer or transistor levels) for the sake of critical path delay reduction, such as (segmented) computational resources (e.g., adder and multiplier) [8,21,27,28] and least significant bits (LSBs) truncation [13,20]. These techniques are well suited for relatively small, simple systems like DSP circuits [7,13,20]. Contrarily, the coarse-grained techniques aim at reducing the amount of computations, such as task skipping [17,18], input sampling [22], pruning [25], and data reuse [6,14,15], and they are more suitable for relatively large, complex systems, like multicore processors with multiple memory hierarchies [5,6,14,15,17,22,25].…”

Section: Introduction

mentioning

confidence: 99%

Approximate Data Reuse-based Accelerator Design for Embedded Processor

Osawa; Hara-Azumi

2019, ACM Trans. Des. Autom. Electron. Syst.

Due to increasing diversity and complexity of applications in embedded systems, accelerator designs trading off area/energy-efficiency and design-productivity are becoming a further crucial issue. Targeting applications in the category of Recognition, Mining, and Synthesis (RMS), this study proposes a novel accelerator design to achieve a good trade-off in efficiency and design-productivity (or reusability) by introducing a new computing paradigm called "approximate computing" (AC). Leveraging from the facts that frequently executed parts of applications (i.e., hotspots) are conventionally the target of acceleration and that RMS applications are error-tolerant and often take similar input data repeatedly, our proposed accelerator reuses previous computational results of similar enough data to reduce computations. The proposed accelerator is composed of a simple controller and a dedicated memory to store limited sets of previous input data with corresponding computational results in a hotspot. Therefore, this accelerator can be applied to different and/or multiple hotspots/applications only through small extension of the controller, to achieve efficient accelerator design and resolve the design-productivity issue. We conducted quantitative evaluations using a representative RMS application (image compression) to demonstrate the effectiveness of our method over conventional ones with precise computing. Moreover, we provide important findings on parameter exploration for our accelerator design, offering a wider applicability of our accelerator to other applications. CCS Concepts: • Computer systems organization → Embedded hardware;
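The reuse idea in the abstract above (a small memory of previous inputs and results, consulted before running a hotspot) can be illustrated in software. The following is a minimal sketch only, not the accelerator described by the authors: the class name `ApproxReuseTable`, the `capacity` and `tolerance` parameters, the per-element similarity test, the FIFO eviction policy, and the `block_sum` stand-in hotspot are all assumptions made for illustration.

```python
from collections import OrderedDict


class ApproxReuseTable:
    """Small table of previous inputs and results, looked up by similarity."""

    def __init__(self, capacity=4, tolerance=2):
        self.capacity = capacity      # hypothetical: number of stored entries
        self.tolerance = tolerance    # hypothetical: max per-element difference
        self.entries = OrderedDict()  # stored input tuple -> cached result
        self.hits = 0
        self.misses = 0

    def _similar(self, a, b):
        # Assumed similarity test: same length, every element within tolerance.
        return len(a) == len(b) and all(
            abs(x - y) <= self.tolerance for x, y in zip(a, b)
        )

    def compute(self, inputs, hotspot):
        key = tuple(inputs)
        for stored, result in self.entries.items():
            if self._similar(stored, key):
                self.hits += 1
                return result  # reuse: the hotspot computation is skipped
        self.misses += 1
        result = hotspot(key)
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry (FIFO)
        self.entries[key] = result
        return result


def block_sum(pixels):
    """Stand-in hotspot; any expensive pure function would do."""
    return sum(pixels)


table = ApproxReuseTable(capacity=2, tolerance=2)
first = table.compute([10, 20, 30], block_sum)   # miss: computed exactly
second = table.compute([11, 19, 31], block_sum)  # hit: reuses the first result
third = table.compute([50, 60, 70], block_sum)   # miss: too far from stored input
```

In this sketch the second call returns the cached value for a nearby input rather than its exact sum, which is the accuracy-for-computation trade the abstract describes; a larger `tolerance` would raise the hit rate at the cost of larger errors.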


“…System designers can capitalize on this observation to trade energy efficiency and performance for precision (fineness of measurement resolution) and accuracy (difference between measured signal values and the true signal value). These tradeoffs have been investigated by several research efforts in the last decade (1)(2)(3)(4)(5)(6)(7)(8)(9)(10)(11)(12)(13)(14)(15)(16). Despite the significant research interest in efficiency versus precision and accuracy tradeoffs however, no common open hardware platforms for research evaluation exist today.…”

mentioning

confidence: 99%

Warp: A Hardware Platform for Efficient Multimodal Sensing With Adaptive Approximation

Stanley-Marbell; Rinard

2020, IEEE Micro

We present Warp, a hardware platform to support research in approximate computing, sensor energy optimization, and energy-scavenged systems. Warp incorporates 11 state-of-the-art sensor integrated circuits, computation, and an energy-scavenged power supply, all within a miniature system that is just 3.6 cm×3.3 cm×0.5 cm. Warp's sensor integrated circuits together contain a total of 21 sensors with a range of precisions and accuracies for measuring eight sensing modalities of acceleration, angular rate, magnetic flux density (compass heading), humidity, atmospheric pressure (elevation), infrared radiation, ambient temperature, and color. Warp uses a combination of analog circuits and digital control to facilitate further tradeoffs between sensor and communication accuracy, energy efficiency, and performance. This article presents the design of Warp and presents an evaluation of our hardware implementation. The results show how Warp's design enables performance and energy efficiency versus accuracy tradeoffs.

Approximate Computing | Approximate Communication | Sensors | Energy Scavenging

