In this work, we investigate various non-ideal effects (stuck-at faults (SAF), IR drop, thermal noise, shot noise, and random telegraph noise) of the ReRAM crossbar when employing it as a dot-product engine for deep neural network (DNN) acceleration. To examine the impact of these non-ideal effects, we first develop a comprehensive framework called PytorX, built on the mainstream PyTorch deep learning framework. PytorX performs end-to-end training, mapping, and evaluation for crossbar-based neural network accelerators while jointly considering all of the above non-ideal effects of the ReRAM crossbar. Experiments with PytorX show that directly mapping a trained large-scale DNN onto crossbars without accounting for these non-ideal effects can lead to complete system malfunction (i.e., accuracy equal to random guessing) as the network grows deeper and wider. In particular, to address SAF effects, we propose a digital SAF error-correction algorithm that compensates for crossbar output errors and requires only one-time profiling to achieve almost no degradation in system accuracy. Then, to overcome IR-drop effects, we propose a Noise Injection Adaption (NIA) methodology that incorporates the statistics of the current shift caused by IR drop in each crossbar as stochastic noise in the DNN training algorithm, which efficiently regularizes the DNN model and makes it intrinsically adaptive to non-ideal ReRAM crossbars. It is a one-time training method that does not require retraining for every specific crossbar. The remaining non-ideal effects can be handled simply by optimizing the system operating frequency. Various experiments on different DNNs for image recognition demonstrate the efficacy of the proposed methodology.
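A minimal sketch of the Noise Injection Adaption (NIA) idea described above, assuming the IR-drop-induced current shift of a crossbar has been profiled offline into a mean/std pair (the layer name `NIALinear` and the parameters `ir_mean`/`ir_std` are hypothetical, not from the paper):

```python
import torch
import torch.nn as nn

class NIALinear(nn.Module):
    """Linear layer that injects IR-drop-like stochastic noise during training."""
    def __init__(self, in_features, out_features, ir_mean=0.0, ir_std=0.02):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)
        self.ir_mean = ir_mean   # profiled mean current shift (assumption)
        self.ir_std = ir_std     # profiled std of current shift (assumption)

    def forward(self, x):
        y = self.fc(x)
        if self.training:
            # Model the crossbar output error as additive stochastic noise so that
            # the trained weights become intrinsically tolerant to IR drop.
            y = y + torch.randn_like(y) * self.ir_std + self.ir_mean
        return y
```

Because the noise statistics are fixed at training time, a single training pass suffices; no per-crossbar retraining is performed in this sketch.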
Constructing clock trees for modern designs is challenging because the trees must include adequate safety margins so that the skew constraints are satisfied even under variations. The safety margin required for a skew constraint depends on the distance between the corresponding sequential elements in the tree topology. In certain cases, the amount of safety margin that can be inserted is limited; consequently, the corresponding sequential elements should be placed close together in the topology, i.e., their point of divergence should be low in the clock tree, to reduce the influence of variations. Building on safety margins and a lowered point of divergence, we present a framework for constructing useful-skew trees with large safety margins inserted into the skew constraints. The framework, called UST-LSM, first identifies tight skew constraints by detecting negative cycles in a weighted skew constraint graph. Next, the sequential elements corresponding to these skew constraints are clustered early in the tree topology. Compared with earlier studies, we can allow larger safety margins in skew constraints spanning sequential elements within a subtree. This translates into an improvement in yield from 46.8% to 98.8% on a synthesized benchmark with 7,674 sequential elements and 63,440 skew constraints.
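A minimal sketch of the negative-cycle detection used to identify tight skew constraints, based on Bellman-Ford relaxation. The graph encoding assumed here (nodes are sequential elements, an edge of weight w encodes a difference constraint of the form skew(v) - skew(u) <= w, with safety margins already folded into w) is illustrative; UST-LSM's exact formulation may differ:

```python
def has_negative_cycle(num_nodes, edges):
    """edges: list of (u, v, w) encoding skew(v) - skew(u) <= w."""
    dist = [0.0] * num_nodes  # zero-initialization acts as a virtual source to all nodes
    for _ in range(num_nodes - 1):
        updated = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                updated = True
        if not updated:
            break
    # One extra relaxation pass: any further improvement implies a negative cycle,
    # i.e., a set of skew constraints that is tight and cannot absorb more margin.
    return any(dist[u] + w < dist[v] for u, v, w in edges)
```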
Clock trees must be constructed to function even under the influence of on-chip variations (OCV). Bounding the latency of a clock tree, i.e., the maximum delay from the tree root to any sequential element, is important because the latency correlates with the maximum magnitude of the skews caused by OCV. In this paper, we introduce a latency constraint graph (LCG) that captures the latencies of a set of subtrees and the skew constraints between the subtrees. The minimum latency of a clock tree that can be constructed from the corresponding subtrees equals the negative of the length of a shortest path in the LCG, which can be computed in O(VE) time. Based on the LCG, we propose a framework consisting of a latency-aware clock tree synthesis (CTS) phase and a clock tree optimization (CTO) phase to construct latency-bounded clock trees. When applied to a set of synthesized circuits, the framework constructs latency-bounded clock trees with higher yield than clock trees constructed in previous studies.
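A minimal sketch of the O(VE) shortest-path computation on a latency constraint graph, again via Bellman-Ford; the graph representation and the choice of source node are illustrative assumptions, and the final sign flip follows the statement above that the minimum latency is the negative of a shortest-path length:

```python
import math

def shortest_paths(num_nodes, edges, source=0):
    """Bellman-Ford over an LCG given as (u, v, w) edges; runs in O(VE)."""
    dist = [math.inf] * num_nodes
    dist[source] = 0.0
    for _ in range(num_nodes - 1):   # V - 1 relaxation rounds
        for u, v, w in edges:        # each round scans all E edges
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

# Hypothetical usage: the minimum achievable latency of the merged clock tree
# is the negative of the relevant shortest-path length in the LCG, e.g.:
#   min_latency = -min(shortest_paths(num_nodes, lcg_edges))
```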
Neural SDEs with Brownian motion as the noise source yield smoother attributions than traditional ResNets. Attribution methods such as saliency maps, integrated gradients, DeepSHAP, and DeepLIFT have been shown to be more robust for neural SDEs than for ResNets under the recently proposed sensitivity metric. In this paper, we show that neural SDEs with adaptive, attribution-driven noise produce even more robust attributions and smaller sensitivity metrics than traditional neural SDEs driven by Brownian motion. In particular, attribution-driven shaping of the noise reduces the sensitivity metric for integrated gradients by 6.7%, 6.9%, and 19.4% on three discrete approximations of neural SDEs with standard Brownian motion noise: stochastic ResNet-50, WideResNet-101, and ResNeXt-101, respectively. The neural SDE model with adaptive attribution-driven noise also improves the SIC metric by 25.7% and 4.8% over traditional ResNets and neural SDEs with Brownian motion, respectively. To the best of our knowledge, we are the first to propose using attributions to shape the noise injected into neural SDEs, and to demonstrate that this process yields more robust attributions than traditional neural SDEs with standard Brownian motion noise.
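A hedged sketch of attribution-driven noise shaping in one discretized neural SDE (residual) block. The shaping rule used here, attenuating the Brownian increment on highly attributed features via a cheap gradient-based attribution proxy, is an illustrative assumption and not the paper's exact formulation; the class name `AttributionNoisyBlock` and its parameters are hypothetical:

```python
import torch
import torch.nn as nn

class AttributionNoisyBlock(nn.Module):
    """Euler step x_{t+1} = x_t + f(x_t) dt + sigma(x_t) dW_t with shaped noise."""
    def __init__(self, f, dt=0.1, base_sigma=0.1):
        super().__init__()
        self.f = f                   # drift network (any nn.Module)
        self.dt = dt
        self.base_sigma = base_sigma

    def forward(self, x):
        with torch.enable_grad():    # attribution needs gradients even in eval
            if not x.requires_grad:
                x = x.clone().requires_grad_(True)
            drift = self.f(x)
            # Attribution proxy: |d sum(f(x)) / dx|, a cheap stand-in for the
            # integrated-gradients attributions discussed in the abstract.
            attr = torch.autograd.grad(drift.sum(), x, create_graph=True)[0].abs()
        attr = attr / (attr.flatten(1).amax(dim=1).view(-1, *[1] * (x.dim() - 1)) + 1e-8)
        dW = torch.randn_like(x) * self.dt ** 0.5   # Brownian increment
        # Illustrative shaping: inject less noise where attributions are large.
        return x + drift * self.dt + self.base_sigma * (1.0 - attr) * dW
```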