Mehrzad Samadi scite author profile

Approximate computing, where computation accuracy is traded off for better performance or higher data throughput, is one solution that can help data processing keep pace with the current and growing abundance of information. For particular domains, such as multimedia and learning algorithms, approximation is commonly used today. We consider automation to be essential to provide transparent approximation, and we show that larger benefits can be achieved by constructing the approximation techniques to fit the underlying hardware. Our target platform is the GPU because of its high performance capabilities and difficult programming challenges that can be alleviated with proper automation. Our approach, SAGE, combines a static compiler that automatically generates a set of CUDA kernels with varying levels of approximation with a runtime system that iteratively selects among the available kernels to achieve speedup while adhering to a target output quality set by the user. The SAGE compiler employs three optimization techniques to generate approximate kernels that exploit the GPU microarchitecture: selective discarding of atomic operations, data packing, and thread fusion. Across a set of machine learning and image processing kernels, SAGE's approximation yields an average of 2.5× speedup with less than 10% quality loss compared to the accurate execution on an NVIDIA GTX 560 GPU.
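The runtime selection step described in this abstract can be illustrated with a small sketch. This is not SAGE's implementation: the strided-sum "kernels", the `quality` metric, and the `select_kernel` helper are all hypothetical stand-ins in plain Python, assuming only that variants are tried from most to least aggressive and the first one meeting the user's target output quality is kept.

```python
from typing import Callable, List, Sequence, Tuple

Kernel = Callable[[Sequence[float]], float]


def exact_sum(xs: Sequence[float]) -> float:
    """Accurate reference computation."""
    return sum(xs)


def make_strided_sum(stride: int) -> Kernel:
    """Hypothetical approximate variant: read every `stride`-th element and rescale."""
    def kernel(xs: Sequence[float]) -> float:
        sampled = xs[::stride]
        return sum(sampled) * len(xs) / len(sampled)
    return kernel


def quality(exact: float, approx: float) -> float:
    """Assumed quality metric in [0, 1]: one minus the relative error."""
    if exact == 0:
        return 1.0 if approx == 0 else 0.0
    return max(0.0, 1.0 - abs(approx - exact) / abs(exact))


def select_kernel(variants: List[Tuple[str, Kernel]],
                  data: Sequence[float],
                  target_quality: float) -> str:
    """Return the first variant (most aggressive first) that meets the target,
    falling back to the exact kernel when none does."""
    reference = exact_sum(data)
    for name, kernel in variants:
        if quality(reference, kernel(data)) >= target_quality:
            return name
    return "exact"


# Variants ordered from most to least aggressive (assumed to be fastest first).
VARIANTS = [(f"stride{s}", make_strided_sum(s)) for s in (64, 8, 4, 2)]
calibration_input = [float(i) for i in range(1000)]
chosen = select_kernel(VARIANTS, calibration_input, target_quality=0.99)
```

Tightening `target_quality` pushes the selection toward less aggressive variants and eventually back to the exact kernel.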


Input responsiveness: using canary inputs to dynamically steer approximation

Laurenzano, Hill, Samadi, et al. 2016


This paper introduces Input Responsive Approximation (IRA), an approach that uses a canary input (a small program input carefully constructed to capture the intrinsic properties of the original input) to automatically control how program approximation is applied on an input-by-input basis. Motivating this approach is the observation that many of the prior techniques focusing on choosing how to approximate arrive at conservative decisions by discounting substantial differences between inputs when applying approximation. The main challenges in overcoming this limitation lie in making the choice of how to approximate both effectively (e.g., the fastest approximation that meets a particular accuracy target) and rapidly for every input. With IRA, each time the approximate program is run, a canary input is constructed and used dynamically to quickly test a spectrum of approximation alternatives. Based on these runtime tests, the approximation that best fits the desired accuracy constraints is selected and applied to the full input to produce an approximate result. We use IRA to select and parameterize mixes of four approximation techniques from the literature for a range of 13 image processing, machine learning, and data mining applications. Our results demonstrate that IRA significantly outperforms prior approaches, delivering an average of 10.2× speedup over exact execution while minimizing accuracy losses in program outputs.
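The canary workflow in this abstract (build a small input, test approximation alternatives on it, apply the winner to the full input) can be sketched as follows. Everything here is an illustrative assumption rather than the paper's method: the canary is a plain evenly spaced subsample, the only approximation tried is loop perforation of a mean, and the names `make_canary`, `perforated_mean`, `choose_skip`, and `run_with_canary` are hypothetical.

```python
from typing import Sequence, Tuple


def make_canary(data: Sequence[float], size: int) -> Sequence[float]:
    """Hypothetical canary: an evenly spaced subsample of the full input."""
    step = max(1, len(data) // size)
    return data[::step][:size]


def perforated_mean(data: Sequence[float], skip: int) -> float:
    """Loop perforation: visit only every `skip`-th element (data must be non-empty)."""
    visited = data[::skip]
    return sum(visited) / len(visited)


def choose_skip(canary: Sequence[float], skips: Sequence[int],
                max_rel_error: float) -> int:
    """Try the most aggressive skip first; return the first one whose error
    on the canary is within bounds, or 1 (exact) if none qualifies."""
    exact = perforated_mean(canary, 1)
    for skip in sorted(skips, reverse=True):
        approx = perforated_mean(canary, skip)
        err = abs(approx - exact) / abs(exact) if exact else abs(approx)
        if err <= max_rel_error:
            return skip
    return 1


def run_with_canary(data: Sequence[float], skips: Sequence[int] = (2, 4, 8),
                    max_rel_error: float = 0.05,
                    canary_size: int = 64) -> Tuple[int, float]:
    """Pick a skip factor on the canary, then apply it to the full input."""
    canary = make_canary(data, canary_size)
    skip = choose_skip(canary, skips, max_rel_error)
    return skip, perforated_mean(data, skip)


# Two inputs of the same size that tolerate different amounts of approximation.
ramp = [float(i) for i in range(1024)]
spiky = [100.0 if (i // 16) % 8 == 0 else 1.0 for i in range(1024)]
ramp_choice = run_with_canary(ramp)
spiky_choice = run_with_canary(spiky)
```

The smooth ramp tolerates perforation while the spiky input falls back to exact execution, which is the input-by-input behavior the abstract argues a single conservative setting would miss.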


Dynamic parallelization of JavaScript applications using an ultra-lightweight speculation mechanism

Mehrara, Hsu, Samadi, et al. 2011


Dynamic Voltage and Frequency Scheduling for Embedded Processors Considering Power/Performance Tradeoffs

Salehi, Samadi, Najibi, et al. 2011. IEEE Trans. VLSI Syst.


An adaptive method to perform dynamic voltage and frequency scheduling (DVFS) for minimizing the energy consumption of microprocessor chips is presented. Instead of using a fixed update interval, the proposed DVFS system uses adaptive update intervals for optimal frequency and voltage scheduling. The optimization enables the system to rapidly track workload changes so as to meet soft real-time deadlines. The technique, which can be realized with very simple hardware, is completely transparent to the application. The results of applying the method to some real application workloads demonstrate considerable power savings and fewer frequency updates compared to DVFS systems based on fixed update intervals.
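The abstract does not say how the update interval is adapted, so the following is only a toy sketch of the general idea. The halve-on-change and double-when-stable rule, the utilization threshold, the normalized frequency levels, and the function names are all assumptions made for illustration, not the paper's algorithm.

```python
from typing import List, Sequence, Tuple


def next_interval(interval: int, prev_util: float, util: float,
                  lo: int = 1, hi: int = 64, threshold: float = 0.1) -> int:
    """Assumed rule: halve the interval after a large utilization change,
    otherwise double it, clamped to [lo, hi]."""
    if abs(util - prev_util) > threshold:
        return max(lo, interval // 2)
    return min(hi, interval * 2)


def pick_frequency(util: float,
                   levels: Sequence[float] = (0.25, 0.5, 0.75, 1.0)) -> float:
    """Lowest normalized frequency level that covers the measured utilization."""
    for level in levels:
        if util <= level:
            return level
    return levels[-1]


def simulate(trace: Sequence[float],
             start_interval: int = 8) -> List[Tuple[int, float, int]]:
    """Walk a per-tick utilization trace; return (time, frequency, next interval)
    for every scheduling update."""
    t, interval, prev = 0, start_interval, trace[0]
    updates = []
    while t < len(trace):
        util = trace[t]
        interval = next_interval(interval, prev, util)
        updates.append((t, pick_frequency(util), interval))
        prev = util
        t += interval
    return updates


# A workload that is idle for 32 ticks and then becomes busy.
updates = simulate([0.2] * 32 + [0.9] * 32)
```

In this toy trace the interval grows while the load is steady and shrinks once the change is observed, so three updates cover the 64 ticks where a fixed interval of 8 would need eight.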


Adaptive input-aware compilation for graphics engines

Samadi, Hormati, Mehrara, et al. 2012




Mehrzad Samadi

Scaling Performance via Self-Tuning Approximation for Graphics Engines

