A cache line size has a significant effect on miss rate and memory traffic. Today's computers use a fixed line size, typically 32B, which may not be optimal for a given application. Optimal size may also change during application execution. This paper describes a cache in which the line (fetch) size is continuously adjusted by hardware based on observed application accesses to the line. The approach can improve the miss rate, even over the optimal for the fixed line size, as well as significantly reduce the memory traffic.
Software-assisted cache coherence enforcement schemes for large multiprocensor systams with shared global memory and interconnection network have gained increasing attention. P r o p o d softwareaasisted approaches rely on either indiscriminate invalidation or selective invalidation to invalidate stale cache lines. The indiscriminate approach combined with advanced memory hardware can quickly invalidate the entire cache but may result in lower hit ratios. The selective approach may achieve a better hit ratio. However, sequential selection and invalidation of cache or TLB entries is time consuming. We propoae a new solution that offers the fast operation of the indiscriminate invalidation approach and can selectively invalidate cache items without extensive run-time book-keeping and checking. The solution relies on the combination of compiletime reference tagging and individual invalidation of potentially stale cache lines only when referenced. Paformance improvement over an indiscriminate invalidation approach is presented.
Abstract-Neural network simulators that take into account the spiking behavior of neurons are useful for studying brain mechanisms and for engineering applications. Spiking Neural Network (SNN) simulators have been traditionally simulated on large-scale clusters, super-computers, or on dedicated hardware architectures. Alternatively, Graphics Processing Units (GPUs) can provide a low-cost, programmable, and highperformance computing platform for simulation of SNNs. In this paper we demonstrate an efficient, Izhikevich neuron based large-scale SNN simulator that runs on a single GPU. The GPU-SNN model (running on an NVIDIA GTX-280 with 1GB of memory), is up to 26 times faster than a CPU version for the simulation of 100K neurons with 50 Million synaptic connections, firing at an average rate of 7Hz. For simulation of 100K neurons with 10 Million synaptic connections, the GPU-SNN model is only 1.5 times slower than real-time. Further, we present a collection of new techniques related to parallelism extraction, mapping of irregular communication, and compact network representation for effective simulation of SNNs on GPUs. The fidelity of the simulation results were validated against CPU simulations using firing rate, synaptic weight distribution, and inter-spike interval analysis. We intend to make our simulator available to the modeling community so that researchers will have easy access to large-scale SNN simulations.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.