2018 IEEE MIT Undergraduate Research Technology Conference (URTC)
DOI: 10.1109/urtc45901.2018.9244787

Pruned and Structurally Sparse Neural Networks

Abstract: Advances in designing and training deep neural networks have led to the principle that the larger and deeper a network is, the better it can perform. As a result, computational resources have become a key limiting factor in achieving better performance. One strategy to improve network capabilities while decreasing the computation required is to replace dense fully-connected and convolutional layers with sparse layers. In this paper we experiment with training on sparse neural network topologies. First, we test pruni…
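
A minimal NumPy sketch of the idea the abstract describes, replacing a dense fully-connected layer with a sparse one: a fixed binary mask zeroes out most of the weight matrix, so only the surviving connections take part in the forward pass. The layer sizes and 10% density below are illustrative choices, not settings from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sparse_layer(n_in, n_out, density=0.1):
    """Dense weight matrix paired with a fixed binary mask.

    Only a `density` fraction of the connections is kept; the rest are
    structurally absent (their weights stay zero for the whole run).
    """
    weights = rng.standard_normal((n_out, n_in)) * 0.01
    mask = (rng.random((n_out, n_in)) < density).astype(weights.dtype)
    return weights * mask, mask

def forward(weights, x):
    """Forward pass of a sparse fully-connected layer with ReLU."""
    return np.maximum(weights @ x, 0.0)

# Illustrative sizes: a 1024 -> 256 layer at 10% density.
W, mask = make_sparse_layer(1024, 256, density=0.1)
x = rng.standard_normal(1024)
y = forward(W, x)
print("active connections:", int(mask.sum()), "of", mask.size)
```

During training, the mask would be re-applied after each weight update so that the pruned connections never come back.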

Cited by 18 publications (18 citation statements)
References 15 publications

“…Networks are trained to produce desired outputs by adjusting their weights via numerical methods such as gradient descent. Training induces functional suppression of insignificant or redundant edges, which may be supplemented by explicit pruning [32,38]. Such processes are analogous to biological synaptic pruning, an important aspect of maturation [33,34].…”
Section: Background and Motivation
Mentioning confidence: 99%
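
A minimal sketch of the explicit pruning step mentioned in the quote above: weights whose magnitude falls below a percentile cutoff are zeroed out, removing insignificant edges. The 80% prune fraction and matrix shape are illustrative, not values from the cited works.

```python
import numpy as np

def magnitude_prune(weights, prune_fraction=0.8):
    """Zero out the smallest-magnitude weights.

    `prune_fraction` is the fraction of entries to remove; the cutoff is
    the corresponding percentile of |weights|.
    """
    threshold = np.quantile(np.abs(weights), prune_fraction)
    mask = (np.abs(weights) >= threshold).astype(weights.dtype)
    return weights * mask, mask

rng = np.random.default_rng(1)
W = rng.standard_normal((256, 256))
W_pruned, mask = magnitude_prune(W, prune_fraction=0.8)
print("remaining nonzeros:", int(mask.sum()), "of", mask.size)
```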
“…In particular, hybrid local/random networks constructed by augmenting convolutional neural networks [35,36,37] with sparse random structure exhibit superior connectivity properties at reduced computational cost. Novel pseudorandom designs have already eclipsed standard architectures in accuracy and efficiency [38,39,40]. Such methods may allow construction of networks capable of next-generation tasks such as recognition of individuals among a large population, while democratizing access to state-of-the-art technology.…”
Section: Background and Motivation
Mentioning confidence: 99%
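
One possible reading of "augmenting convolutional neural networks with sparse random structure", sketched below under assumptions of our own: a connectivity mask that combines local (banded, convolution-like) edges with a small fraction of random long-range edges. The bandwidth and random density are made-up parameters; the cited works define their own constructions.

```python
import numpy as np

rng = np.random.default_rng(2)

def hybrid_mask(n, bandwidth=2, random_density=0.02):
    """Connectivity mask mixing local (banded) and random long-range edges.

    Local edges mimic the neighbourhood structure of a convolution;
    random edges add sparse long-range shortcuts.
    """
    rows = np.arange(n)[:, None]
    cols = np.arange(n)[None, :]
    local = np.abs(rows - cols) <= bandwidth
    random_edges = rng.random((n, n)) < random_density
    return (local | random_edges).astype(np.float32)

mask = hybrid_mask(n=512, bandwidth=2, random_density=0.02)
print(f"overall density: {mask.mean():.3f}")  # vs 1.0 for a dense layer
```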
“…Sparsely-connected neural networks exhibit lower computational complexity and lower memory requirements compared to their dense counterparts. They may originate by pruning a dense network, as in Banded Sparse Neural Networks [3], or result from training a fixed sparse topology, as in RadiX-Net [4]. The input data matrix may also be sparse, due to feature extraction techniques generating sparse representations (from, e.g., image, video, or signal data), or because the input may be naturally sparse (e.g., graph inputs).…”
Section: Introduction
Mentioning confidence: 99%
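
The lower computational complexity and memory requirements come from storing and multiplying only the nonzero weights. A minimal sketch using SciPy's CSR format (the 4096-wide layer and 1% density are illustrative, not taken from the cited papers):

```python
import numpy as np
from scipy.sparse import random as sparse_random

rng = np.random.default_rng(3)

# A 4096 x 4096 layer at 1% density, stored in CSR (compressed sparse row).
W = sparse_random(4096, 4096, density=0.01, format="csr", random_state=3)
x = rng.standard_normal(4096)

# The sparse matrix-vector product touches only the stored nonzeros.
y = np.maximum(W @ x, 0.0)  # ReLU(Wx)

dense_bytes = 4096 * 4096 * 8            # float64 dense storage
sparse_bytes = W.data.nbytes + W.indices.nbytes + W.indptr.nbytes
print(f"dense: {dense_bytes:,} bytes, CSR: {sparse_bytes:,} bytes")
```

Both time and memory now scale with the number of connections rather than with the full dense layer size.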
“…Images are interpolated to the number of neurons in the neural networks: 1024, 4096, 16384, and 65536. Several deep sparse neural networks are generated using RadiX-Net [4], with the number of neurons, layers, and bytes in Table I. This size, in bytes, assumes Compressed Row Storage (CRS) using four-byte values and indices.…”
Section: Introduction
Mentioning confidence: 99%
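
The byte counts referenced above follow directly from the Compressed Row Storage layout: with four-byte values and four-byte indices, a matrix with n rows and nnz nonzeros needs about 4*nnz bytes for values, 4*nnz bytes for column indices, and 4*(n+1) bytes for row pointers. A small sketch of that accounting (the matrix shape and nonzeros-per-row below are hypothetical, not entries of Table I):

```python
def crs_bytes(n_rows, nnz, value_bytes=4, index_bytes=4):
    """Compressed Row Storage size: one value and one column index per
    nonzero, plus one row pointer per row (plus one)."""
    return nnz * (value_bytes + index_bytes) + (n_rows + 1) * index_bytes

# Hypothetical example: a 4096 x 4096 weight matrix with 32 nonzeros per row.
n_rows = 4096
nnz = 32 * n_rows
print(f"CRS size: {crs_bytes(n_rows, nnz):,} bytes "
      f"vs dense (4-byte values): {n_rows * n_rows * 4:,} bytes")
```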