Applying Reduced Precision Arithmetic to Detect Errors in Floating Point Multiplication

Seetharam, Kushal; Keh, Lance Co Ting; Nathan, Ralph; Sorin, Daniel J.

doi:10.1109/prdc.2013.44

Cited by 8 publications

(12 citation statements)

References 5 publications


“…It mainly considers detection of very large errors, and does not treat rounding which is specific to floating-point arithmetic. Error detection ratio based on this technique reported in [13] is very low as the authors have described. The multiplier based on the method will be used for applications which need to detect only very large errors.…”

Section: Introduction (mentioning)

confidence: 70%

“…Reliable floating-point arithmetic circuits utilizing methods other than full duplication and residue checking were proposed [11]-[14], but, to the best of our knowledge, are not actually used. Recently, reduced precision checking is proposed for floating-point addition and floating-point multiplication in [12] and [13], respectively. The technique uses a small significand adder or a small significand multiplier for checking of the result.…”

Section: Introduction (mentioning)

confidence: 99%
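The reduced-precision check described in the statement above can be illustrated in software. The following is a minimal sketch, not the circuit from the cited paper: it assumes IEEE-754 single precision, keeps only the top k mantissa bits of each operand, and accepts a product if it lies within a loose relative tolerance of the reduced-precision product. The names `truncate_mantissa` and `rp_check`, the default k = 7, and the tolerance 2^(2-k) are assumptions made for illustration; zeros aside, special values (infinities, NaNs, subnormals) are not handled.

```python
import struct

def truncate_mantissa(x: float, k: int) -> float:
    """Round x to float32, then keep only the top k of its 23 mantissa bits."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Clear the low (23 - k) mantissa bits; sign and exponent are untouched.
    mask = 0xFFFFFFFF & ~((1 << (23 - k)) - 1)
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]

def rp_check(a: float, b: float, product: float, k: int = 7) -> bool:
    """Return True if `product` is consistent with a k-bit check of a * b.

    Assumed tolerance: truncating each operand costs a relative error
    below 2**-k, so 2**(2 - k) is a loose bound for 2 <= k <= 23.
    """
    approx = truncate_mantissa(a, k) * truncate_mantissa(b, k)
    tol = 2.0 ** (2 - k)
    return abs(product - approx) <= tol * abs(approx)

# A correct product passes; a corrupted high-order bit is flagged.
print(rp_check(1.5, 2.25, 3.375))            # correct result
print(rp_check(1.5, 2.25, 3.375 + 0.5))      # large error
print(rp_check(1.5, 2.25, 3.375 + 2**-20))   # small error slips through
```

In this sketch, errors smaller than the tolerance are not detected, which is consistent with the citing authors' remark that the approach mainly targets very large errors.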


Floating-Point Multiplier with Concurrent Error Detection Capability by Partial Duplication

Kito; Akimoto; Takagi. 2017. IEICE Trans. Inf. & Syst.

Summary: A floating-point multiplier with concurrent error detection capability by partial duplication is proposed. It uses a truncated multiplier for checking of the significand (mantissa) multiplication instead of full duplication. The proposed multiplier can detect any erroneous output with error larger than one unit in the last place (1 ulp) of the significand, which may be overlooked by residue checking. Its circuit area is smaller than that of a fully duplicated one. Area overhead of a single-precision multiplier is about 78% and that of a double-precision one is about 65%.


“…However, as shown in table 3, the required area and power overheads in our error detecting scheme are much lower, even with a longer mantissa. For example, the design with a 17-bit mantissa requires only 5% and 2.7% area and power overheads, respectively, while its precision is much higher than that of the checker in [20]. In addition, based on [15] the 11-bit mantissa is enough for many applications that verifies our proposed fault-tolerant designs.…”

Section: Results (mentioning)

confidence: 96%

“…As mentioned in Section II, the RPC technique [20] can be used for detecting errors in the floating-point multiplier. This technique in which the 32-bit floating-point multiplier is checked by a k-bit (k<23) reduced-precision floating-point checker multiplier, requires the area and power overheads equal to 17.8% and 35%, respectively, for the checker multiplier in which the mantissa with the size of only 7 bits has been used.…”

Section: Results (mentioning)

confidence: 99%

“…In the last stage of the design there is a hardware that compares the results and determines whether there is an error or not. In [20] the Reduced Precision Checking (RPC) technique has been applied to the floating-point multiplier to detect errors. This study shows that the RPC can successfully detect errors in floating-point multiplication at relatively low cost but cannot correct errors.…”

Section: Related Work (mentioning)

confidence: 99%


A Novel Reduced-Precision Fault-Tolerant Floating-Point Multiplier

Mohajer; Valinataj. 2017. IJMECS

Abstract: This paper presents a new fault-tolerant architecture for floating-point multipliers in which the fault-tolerance capability is achieved at the cost of output precision reduction. In this approach, to achieve the fault-tolerant floating-point multiplier, the hardware cost of the primary design is reduced by output precision reduction. Then, the appropriate redundancy is utilized to provide error detection/correction in such a way that the overall required hardware becomes almost the same as the primary multiplier. The proposed multiplier can tolerate a variety of permanent and transient faults regarding the acceptable reduced precisions in many applications. The implementation results reveal that the 17-bit and 14-bit mantissas are enough to obtain a floating-point multiplier with error detection or error correction, respectively, instead of the 23-bit mantissa in the IEEE-754 standard-based multiplier with a few percent area and power overheads.

Understanding and Improving GPUs' Reliability Combining Beam Experiments with Fault Simulation

Santos; Carro; Rech. 2023. 2023 IEEE European Test Symposium (ETS)

Graphics Processing Units (GPUs) are essential in High Performance Computing (HPC) and safety-critical applications like autonomous vehicles. This market shift led to significant improvements in the programming frameworks and evaluation tools and concerns about their reliability. However, GPUs' high complexity poses challenges in evaluating their reliability. We conducted the first cross-layer GPU reliability evaluation to unveil and mitigate GPU vulnerabilities. The proposed evaluation is achieved by comparing and combining extensive neutron beam experiments, fault simulation campaigns, and application profiling. Based on this detailed analysis, a novel methodology to accurately estimate GPUs application FIT rate is proposed. The cross-layer evaluation enables two novel hardening solutions: (1) Reduced Precision Duplication With Comparison (RP-DWC) executes a redundant copy in reduced precision. RP-DWC delivers excellent fault coverage, up to 86%, with minimal execution time and energy consumption overheads (13% and 24%, respectively). (2) Dedicated software solutions for hardening Convolutional Neural Networks (CNNs) can detect up to 98% of errors.
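The idea of running a redundant copy in reduced precision and comparing the two results can be sketched at the software level. The example below is only an illustration under stated assumptions, not the RP-DWC implementation from the abstract: it computes a dot product in double precision, recomputes it with intermediate values rounded to single precision (simulated on the CPU, not guaranteed bit-exact with hardware float32), and flags a mismatch beyond a relative threshold. The function names, the `rel_tol` value, and the `inject` hook used to simulate a fault are all hypothetical.

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dot_f64(xs, ys):
    """Full-precision (double) dot product: the primary computation."""
    return sum(x * y for x, y in zip(xs, ys))

def dot_f32(xs, ys):
    """Reduced-precision copy: every intermediate is rounded to float32."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc = to_f32(acc + to_f32(to_f32(x) * to_f32(y)))
    return acc

def checked_dot(xs, ys, rel_tol=1e-4, inject=None):
    """Return (result, ok); ok is False when the two copies disagree.

    `inject` is a hypothetical hook that corrupts the primary result
    so the comparison can be exercised.
    """
    full = dot_f64(xs, ys)
    if inject is not None:
        full = inject(full)
    reduced = dot_f32(xs, ys)
    scale = max(abs(full), abs(reduced), 1e-30)
    return full, abs(full - reduced) <= rel_tol * scale

xs, ys = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
print(checked_dot(xs, ys))                            # copies agree
print(checked_dot(xs, ys, inject=lambda v: v * 2.0))  # simulated fault
```

The threshold trades detection coverage against false alarms: it has to be loose enough to absorb the legitimate rounding difference between the two precisions, so faults that perturb the result by less than the threshold go unnoticed.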
