“…Indeed, prior efforts spanning hardware, software, and algorithms have exploited sparsity to eliminate computation or data transfers at different points in DNN computations. Most of these efforts, though, require hardware changes [3,7,13,34,38,40,57] and/or apply only to inference [3,7,13,15,34,35,50,53,57]. This is not ideal, since most real-world DNN computations are performed on conventional CPUs and GPUs [4,16,33,51], and significant time goes into training.…”