In recent years, convolutional neural networks (CNNs) have shown great performance in various fields such as image classification, pattern recognition, and multi-media compression. Two of the feature properties, local connectivity and weight sharing, can reduce the number of parameters and increase processing speed during training and inference. However, as the dimension of data becomes higher and the CNN architecture becomes more complicated, the endto-end approach or the combined manner of CNN is computationally intensive, which becomes limitation to CNN's further implementation. Therefore, it is necessary and urgent to implement CNN in a faster way. In this paper, we first summarize the acceleration methods that contribute to but not limited to CNN by reviewing a broad variety of research papers. We propose a taxonomy in terms of three levels, i.e. structure level, algorithm level, and implementation level, for acceleration methods. We also analyze the acceleration methods in terms of CNN architecture compression, algorithm optimization, and hardware-based improvement. At last, we give a discussion on different perspectives of these acceleration and optimization methods within each level. The discussion shows that the methods in each level still have large exploration space. By incorporating such a wide range of disciplines, we expect to provide a comprehensive reference for researchers who are interested in CNN acceleration.
As minimum feature size and pitch spacing further scale down, triple patterning lithography is a likely 193 nm extension along the paradigm of double patterning lithography for 14-nm technology node. Layout decomposition, which divides input layout into several masks to minimize the conflict and stitch numbers, is a crucial design step for double/triple patterning lithography. In this paper, we present a systematic study on triple patterning layout decomposition problem, which is shown to be NP-hard. Because of the NP-hardness, the runtime required to exactly solve it increases dramatically with the problem size. We first propose a set of graph division techniques to reduce the problem size. Then, we develop integer linear programming (ILP) to solve it. For large layouts, even with the graph-division techniques, ILP may still suffer from serious runtime overhead. To achieve better trade-off between runtime and performance, we present a novel semidefinite programming (SDP)-based algorithm. Followed by a mapping process, we can translate the SDP solutions into the final decomposition solutions. Experimental results show that the graph division can reduce runtime dramatically. In addition, SDP-based algorithm can achieve great speedup even compared with accelerated ILP, with very comparable results in terms of the stitch number and the conflict number.Index Terms-Graph division, integer linear programming (ILP), layout decomposition, semidefinite programming (SDP), triple patterning lithography (TPL).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.