Although Deep Neural Networks (DNNs) have achieved excellent performance on many tasks, improving their generalization capacity remains a challenge. In this work, we propose a novel regularizer named the Ensemble-based Decorrelation Method (EDM), motivated by the idea of ensemble learning, to improve the generalization capacity of DNNs. EDM can be applied to hidden layers in fully connected or convolutional neural networks. We treat each hidden layer as an ensemble of several base learners by dividing its hidden units into several non-overlapping groups, each of which is viewed as a base learner. EDM encourages DNNs to learn more diverse representations by minimizing the covariance between all base learners during training. Experimental results on the MNIST and CIFAR datasets demonstrate that EDM effectively reduces overfitting and improves the generalization capacity of DNNs.
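The following is a minimal PyTorch sketch of a decorrelation penalty in the spirit described above. The choice of summarizing each group by its mean activation and penalizing the squared off-diagonal covariance entries is an assumption for illustration; the paper's exact definition of the base-learner outputs and the covariance term may differ, and the names `edm_penalty` and `lambda_edm` are hypothetical.

```python
import torch


def edm_penalty(hidden: torch.Tensor, num_groups: int) -> torch.Tensor:
    """Sketch of an EDM-style decorrelation penalty for one hidden layer.

    hidden: (batch, units) activations; units must be divisible by num_groups.
    Units are split into non-overlapping groups ("base learners"), each group
    is summarized by its mean activation, and the off-diagonal covariance
    between group summaries over the batch is penalized.
    """
    batch, units = hidden.shape
    groups = hidden.view(batch, num_groups, units // num_groups)
    learners = groups.mean(dim=2)                       # (batch, num_groups)
    centered = learners - learners.mean(dim=0, keepdim=True)
    cov = centered.t() @ centered / batch               # (num_groups, num_groups)
    off_diag = cov - torch.diag(torch.diag(cov))        # keep only cross-learner terms
    return (off_diag ** 2).sum()


# Hypothetical usage during training:
# loss = task_loss + lambda_edm * edm_penalty(hidden_activations, num_groups=4)
```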
In the literature, tensors have been effectively used to capture context information in language models. However, existing methods usually adopt relatively low-order tensors, which have limited expressive power for modeling language. Developing a higher-order tensor representation is challenging, both in deriving an effective solution and in showing its generality. In this paper, we propose a language model named the Tensor Space Language Model (TSLM), built on tensor networks and tensor decomposition. In TSLM, we construct a high-dimensional semantic space from the tensor product of word vectors. Theoretically, we prove that this tensor representation is a generalization of the n-gram language model. We further show that this high-order tensor representation can be decomposed into a recursive calculation of conditional probabilities for language modeling. Experimental results on the Penn Treebank (PTB) dataset and the WikiText benchmark demonstrate the effectiveness of TSLM.
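A hedged sketch of the kind of tensor-space representation described above, in illustrative notation that may differ from the paper's: a length-$n$ sequence is represented by the tensor product of its word vectors, and the model scores it by contracting that tensor with a (decomposed) parameter tensor. With one-hot word vectors the contraction reduces to a table lookup over $n$-grams, which is the sense in which such a representation generalizes the n-gram model.

```latex
% Illustrative notation; the paper's exact formulation may differ.
T_{w_1 \dots w_n} \;=\; v_{w_1} \otimes v_{w_2} \otimes \cdots \otimes v_{w_n},
\qquad
p(w_1, \dots, w_n) \;\propto\; \langle \mathcal{A},\; T_{w_1 \dots w_n} \rangle,
```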
The prediction of strip elongation in the furnace is extremely important in the annealing process, since it determines product quality and yield [1, 2]. Furthermore, the safety of the air knife also depends on the prediction accuracy [3, 4]. We therefore propose an optimal soft-sensing method based on kernel principal component analysis (KPCA) and a weighted least squares support vector machine (WLSSVM) optimized by immune clone particle swarm optimization (ICPSO). ICPSO, an extension of the particle swarm optimization (PSO) algorithm, prevents the particles from sinking into premature convergence and local optima during the iterative process, and is used to optimize the parameters of the WLSSVM. The method first uses KPCA to denoise the input data set and extract the high-dimensional nonlinear principal components of the input space; these principal components are then fed into the ICPSO-WLSSVM model to establish the soft-sensing prediction model. The proposed method is successfully applied to strip elongation prediction in an annealing furnace. Simulations with production data show that the KPCA and ICPSO-WLSSVM model achieves higher prediction accuracy than the other compared algorithms.
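The snippet below sketches only the structure of this pipeline under stated assumptions: scikit-learn's KernelPCA stands in for the KPCA denoising/feature-extraction step, and an RBF support vector regressor with fixed hyperparameters stands in for the ICPSO-optimized WLSSVM, since neither WLSSVM nor ICPSO is a standard library component. The data here is synthetic placeholder data, not production data.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.svm import SVR

# Placeholder data: X = process measurements, y = measured strip elongation.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))
y = 0.5 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=200)

# Step 1: KPCA extracts nonlinear principal components of the input space
# (this plays the denoising / dimensionality-reduction role described above).
kpca = KernelPCA(n_components=5, kernel="rbf", gamma=0.1)
X_kpca = kpca.fit_transform(X)

# Step 2: fit the regressor on the KPCA features.  In the paper this is a
# WLSSVM whose parameters are tuned by ICPSO; an SVR with hand-picked
# hyperparameters is used here purely as a structural stand-in.
model = SVR(kernel="rbf", C=10.0, gamma="scale")
model.fit(X_kpca, y)

# Predict elongation for new operating conditions.
X_new = rng.normal(size=(5, 12))
y_pred = model.predict(kpca.transform(X_new))
```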