2014
DOI: 10.1145/2678373.2665746
General-purpose code acceleration with limited-precision analog computation

Abstract: As improvements in per-transistor speed and energy efficiency diminish, radical departures from conventional approaches are becoming critical to improving the performance and energy efficiency of general-purpose processors. We propose a solution-from circuit to compiler-that enables general-purpose use of limited-precision, analog hardware to accelerate "approximable" code-code that can tolerate imprecise execution. We utilize an algorithmic transformation that automatically converts approximable regions of code…
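A minimal sketch of the kind of transformation the abstract describes: a small neural network is trained offline to mimic an approximable code region, and its weights are then quantized as a crude stand-in for limited-precision analog execution. The region, network size, and quantization scheme below are illustrative assumptions, not details from the paper.

```python
import numpy as np

def approximable_region(x):
    """Hypothetical stand-in for a hot, error-tolerant code region."""
    return np.sin(x) * np.exp(-0.1 * x)

# Observe input/output samples of the original code, as a compiler-side
# transformation would.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(2000, 1)) / 10.0   # normalized inputs
Y = approximable_region(10.0 * X)

# Tiny 1-8-1 MLP surrogate (sizes are illustrative).
W1 = rng.normal(0, 0.5, (1, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)

def forward(x, w1, c1, w2, c2):
    h = np.tanh(x @ w1 + c1)        # hidden layer
    return h @ w2 + c2, h

lr = 0.1
for _ in range(5000):               # plain batch gradient descent
    out, h = forward(X, W1, b1, W2, b2)
    err = out - Y
    gW2 = h.T @ err / len(X); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1.0 - h ** 2)
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

def quantize(w, bits=8):
    """Uniform quantization: a crude model of limited analog precision."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1) or 1.0
    return np.round(w / scale) * scale

approx, _ = forward(X, *(quantize(p) for p in (W1, b1, W2, b2)))
print("mean abs error of limited-precision surrogate:",
      float(np.abs(approx - Y).mean()))
```

In the actual system the trained network would run on the analog neural hardware rather than being quantized in software; the quantization here only mimics the effect of limited precision on output quality.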

Cited by 76 publications (37 citation statements)
References 59 publications
“…Hardware implementations for NNs have been developed using various forms of technology [58], including ASIC (both digital and analog) [67], [19], [41], [72], [1], FPGA [89], and neuromorphic hardware [75], [37], along with specialized fault-tolerant designs [4], [78], [36]. GPU implementations of NNs [44], [63] have also gained in popularity.…”
Section: Neural Network Implementations
confidence: 99%
“…However, the data request is still sent to the memory. When the data for an approximate load arrives, the core updates the prediction tables without checking the status of the approximate load that generated the request. Predictor design.…”
Section: Georgia Institute of Technology / Carnegie Mellon University
confidence: 99%
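A minimal sketch of how such a rollback-free predictor might be organized, assuming a direct-mapped, last-value table indexed by the load's PC; the table organization is an assumption for illustration, not the cited design.

```python
TABLE_SIZE = 256  # number of predictor entries (assumed)

class LastValuePredictor:
    """Rollback-free load-value predictor sketch: mispredictions are never
    checked or rolled back, only absorbed as approximation error."""

    def __init__(self):
        self.table = [0] * TABLE_SIZE

    def predict(self, load_pc):
        # On a miss to an approximate load, the pipeline consumes this value
        # immediately while the request still goes out to memory.
        return self.table[load_pc % TABLE_SIZE]

    def train(self, load_pc, actual_value):
        # Called when the memory response arrives; the entry is updated
        # unconditionally, without checking the status of the load that
        # generated the request.
        self.table[load_pc % TABLE_SIZE] = actual_value

predictor = LastValuePredictor()
value = predictor.predict(0x4A30)   # use the predicted value, keep executing
predictor.train(0x4A30, 42)         # train when the real data comes back
```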
“…These techniques exploit the inherent error resiliency of a wide range of applications including web search, data analytics, image processing, cyber-physical systems, recognition, and optimization to improve performance and efficiency through approximation. Instances of these approximation techniques include (i) voltage over-scaling [9,4]; (ii) loop perforation [15]; (iii) loop early termination [3]; (iv) computation substitution [11,8,2,1,3]; (v) limited fault recovery [6]; and (vi) approximate storage design [10,13]. We define a new technique, rollback-free value prediction, which operates at the fine granularity of a single load instruction.…”
Section: Introduction
confidence: 99%
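Of the techniques this snippet lists, loop perforation is the easiest to show in a few lines: only every k-th iteration executes, trading output quality for work saved. The kernel and stride below are illustrative.

```python
def mean_brightness(pixels, stride=1):
    """Average over every `stride`-th pixel; stride > 1 perforates the loop."""
    total, count = 0.0, 0
    for i in range(0, len(pixels), stride):  # skipped iterations = saved work
        total += pixels[i]
        count += 1
    return total / count

pixels = [(i * 37) % 256 for i in range(10_000)]
exact = mean_brightness(pixels)        # stride 1: full loop
approx = mean_brightness(pixels, 4)    # run 1 in 4 iterations (~4x less work)
print(f"exact={exact:.2f}  approx={approx:.2f}  error={abs(exact - approx):.2f}")
```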
“…These techniques measure the quality of the whole output, which is usually equal to the average quality of the individual output elements, e.g., pixels in an image. Previous works [16,4] in approximate computing show that most of the output elements have small errors but a few output elements have considerably large errors, even though the average error is low. These large errors can degrade the whole user experience.…”
Section: Introduction
confidence: 99%
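The point this snippet makes is easy to reproduce: two approximate outputs with a similar average per-element error can have very different worst-case elements. The synthetic data below is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
reference = rng.uniform(0, 255, 10_000)          # e.g., exact pixel values

# Output A: small uniform noise on every element.
uniform = reference + rng.normal(0, 2.0, reference.shape)

# Output B: exact almost everywhere, but a few elements are badly wrong.
spiky = reference.copy()
idx = rng.choice(reference.size, 50, replace=False)
spiky[idx] += 300 * rng.choice([-1.0, 1.0], 50)

for name, out in [("uniform", uniform), ("spiky", spiky)]:
    e = np.abs(out - reference)
    print(f"{name:7s} mean={e.mean():5.2f}  max={e.max():6.2f}")
# Both means are ~1.5, but the spiky output's worst elements are far worse,
# which is what degrades the perceived quality.
```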
“…Software techniques include loop perforation [1], approximate memoization [11,31], tile approximation [31], discarding high-overhead computations [32,36], and relaxed synchronization [28]. Furthermore, there are many hardware-based approximation techniques that employ neural processing modules [16,4], analog circuits [4], low-power ALUs and storage [34], dual-voltage processors [15], hardware-based fuzzy memoization [2,3], and approximate memory modules [35]. Approximation accelerators [16,41,14] utilize these techniques to trade off accuracy for better performance and/or higher energy savings.…”
Section: Introduction
confidence: 99%
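Among the techniques listed here, fuzzy (approximate) memoization is simple to sketch: inputs are quantized into buckets so that nearby inputs reuse a cached result instead of recomputing. The bucket width is an assumed tuning knob, and the function is illustrative.

```python
import math

_cache = {}

def fuzzy_sqrt(x, bucket=0.5):
    """Approximate memoization sketch: nearby inputs share one cache entry."""
    key = round(x / bucket)          # quantize the input to a bucket
    if key not in _cache:
        _cache[key] = math.sqrt(x)   # compute once per bucket
    return _cache[key]               # later calls may get a close-but-stale
                                     # result computed for a different x

print(fuzzy_sqrt(16.0))   # computes sqrt(16.0) = 4.0
print(fuzzy_sqrt(16.2))   # reuses the 16.0 entry -> approximate answer
```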