Pramod Udupa scite author profile

Pramod Udupa

5 Publications

10 Citation Statements Received

35 Citation Statements Given


Affiliations

Samsung (India), Indian Institute of Science Bangalore, Institut de Recherche en Informatique et Systèmes Aléatoires

Publications


A Novel Hierarchical Low Complexity Synchronization Method for OFDM Systems

Udupa, Sentieys, Scalart, 2013


A new hierarchical synchronization method is proposed for initial timing synchronization in orthogonal frequency-division multiplexing (OFDM) systems. Based on a newly proposed training symbol, a threshold-based timing metric is designed to accurately estimate the start of the OFDM symbol in a frequency-selective channel. The threshold is defined in terms of the noise distribution and the false-alarm probability, which makes it independent of the type of channel to which it is applied. Frequency offset estimation is also performed for the proposed training symbol. The performance of the proposed timing metric is evaluated through simulation. The proposed method achieves low mean squared error (MSE) in timing offset estimation at five times (5×) lower computational complexity than a cross-correlation-based method in a frequency-selective channel. It is also computationally efficient compared to hybrid approaches for OFDM timing synchronization.
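As a rough illustration of a threshold set from a noise distribution and a false-alarm probability, the sketch below assumes the noise-only timing metric is the squared magnitude of zero-mean circular complex Gaussian noise, which is exponentially distributed with mean equal to the noise power. That distribution, the function names, and the first-crossing rule are assumptions made for illustration; the abstract does not specify the paper's actual metric or threshold formula.

```python
import math


def false_alarm_threshold(noise_power, p_false_alarm):
    """Threshold T such that P(metric > T | noise only) = p_false_alarm.

    Assumes (hypothetically) an exponentially distributed noise-only metric
    with mean `noise_power`, so P(metric > T) = exp(-T / noise_power).
    """
    if noise_power <= 0.0:
        raise ValueError("noise_power must be positive")
    if not 0.0 < p_false_alarm < 1.0:
        raise ValueError("p_false_alarm must lie strictly between 0 and 1")
    return -noise_power * math.log(p_false_alarm)


def first_crossing(metric, threshold):
    """Return the first sample index where the metric exceeds the threshold."""
    for n, value in enumerate(metric):
        if value > threshold:
            return n
    return None  # no detection


# Usage: pick a threshold for a 0.1% false-alarm probability, then search.
threshold = false_alarm_threshold(noise_power=1.0, p_false_alarm=1e-3)
start_index = first_crossing([0.2, 0.5, 1.1, 9.4, 12.0], threshold)
```

Because the threshold here depends only on the noise power and the target false-alarm probability, it does not change with the channel profile, which matches the channel-independence property the abstract claims.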


IKW: Inter-Kernel Weights for Power Efficient Edge Computing

et al. 2020


Deep convolutional neural networks (CNNs) have achieved state-of-the-art recognition accuracy in a wide range of computer vision applications such as image classification, object detection, and semantic segmentation. CNN-based applications require millions of multiply-accumulate (MAC) operations between input pixels and kernel weights during inference. This work investigates a technique that eliminates redundant multiplications for a subset of kernel weights in a CNN layer by exploiting identical and/or similar inter-kernel weights (IKW) across kernels. The IKW technique identifies identical and/or similar inter-kernel weights in trained, pruned or unpruned, quantized CNN kernels before the inference phase. After identification, one subset of kernel weights, termed non-pivot kernel weights, is set to zero, while the other subset, called pivot kernel weights, is left unchanged. The multiplications corresponding to non-pivot kernel weights are eliminated, reducing computation. The products corresponding to non-pivot kernel weights are supplied by the multiplications of the pivot kernel weights, so inference accuracy does not degrade. Through experiments on state-of-the-art CNNs, we demonstrate that the IKW technique enhances kernel sparsity by 9-37% for 8-bit precision kernel weights and 18-43% for 4-bit precision kernel weights without degrading the recognition accuracy of the CNN model. Enhanced kernel sparsity can be used to save power by clock gating the compute unit, or to increase execution performance by skipping computations for zero-valued non-pivot kernel weights. In addition, power savings are achieved by eliminating redundant, power-expensive fixed-point multiplications. The practical utility of the IKW technique is demonstrated by mapping it to well-known state-of-the-art CNN accelerator architectures. This mapping shows a reduction in power of at least 12% for 8-bit precision and 19% for 4-bit precision kernel weights, and an improvement in execution performance of at least 2% for 8-bit precision and 13% for 4-bit precision kernel weights.

Index Terms: Inter-kernel weights, quantization, multiply-accumulate unit, split accumulator, kernel zero skipping, convolutional neural network, kernel pruning, identical kernel weights, similar kernel weights.
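The pivot/non-pivot idea described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it handles only identical (not similar) weights, assumes each kernel is a flat list of quantized integer weights where position i in every kernel multiplies the same input pixel, and uses hypothetical function names.

```python
def find_pivots(kernels):
    """Zero out non-pivot weights and record which pivot supplies each product.

    kernels: list of equal-length lists of quantized integer weights.
    Returns (sparse_kernels, reuse), where reuse[(k, i)] is the index of the
    pivot kernel whose product at position i is shared with kernel k.
    """
    sparse = [list(kernel) for kernel in kernels]
    reuse = {}
    for i in range(len(kernels[0])):
        first_seen = {}  # weight value -> index of its pivot kernel
        for k, kernel in enumerate(kernels):
            w = kernel[i]
            if w == 0:
                continue  # already sparse, nothing to share
            if w in first_seen:
                sparse[k][i] = 0  # non-pivot: multiplication eliminated
                reuse[(k, i)] = first_seen[w]
            else:
                first_seen[w] = k  # first occurrence becomes the pivot
    return sparse, reuse


def accumulate(sparse, reuse, pixels):
    """Compute one output per kernel, counting the multiplications performed."""
    outputs = [0] * len(sparse)
    products = {}
    multiplies = 0
    for k, kernel in enumerate(sparse):
        for i, w in enumerate(kernel):
            if w != 0:
                products[(k, i)] = w * pixels[i]
                multiplies += 1
                outputs[k] += products[(k, i)]
    # Non-pivot positions reuse the pivot's product instead of multiplying.
    for (k, i), pivot in reuse.items():
        outputs[k] += products[(pivot, i)]
    return outputs, multiplies


kernels = [[1, 2, 3], [1, 5, 3], [4, 2, 3]]
pixels = [10, 20, 30]
sparse, reuse = find_pivots(kernels)
outputs, multiplies = accumulate(sparse, reuse, pixels)
```

In this toy case the three kernels need 5 multiplications instead of 9, and the outputs equal the dense dot products, which mirrors the abstract's claim that sharing pivot products adds sparsity without changing the result.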


A hardware accelerated system for deep packet inspection

Rao, Udupa, 2010


Accelerating Numerical Linear Algebra Kernels on a Scalable Run Time Reconfigurable Platform

Biswas, Udupa, Mondal et al. 2010


WinDConv: A Fused Datapath CNN Accelerator for Power-Efficient Edge Devices

Mahale, Udupa, Chandrasekharan et al. 2020

IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.

