Recent years have witnessed the success of deep neural networks on a wide range of practical problems. Dropout has played an essential role in many successful deep neural networks by inducing regularization during training. In this paper, we present a new regularized training approach: Shakeout. Instead of randomly discarding units as Dropout does at the training stage, Shakeout randomly chooses to enhance or reverse each unit's contribution to the next layer. This minor modification of Dropout has a notable statistical trait: the regularizer induced by Shakeout adaptively combines L0, L1 and L2 regularization terms. Our classification experiments with representative deep architectures on the image datasets MNIST, CIFAR-10 and ImageNet show that Shakeout deals with over-fitting effectively and outperforms Dropout. We empirically demonstrate that Shakeout leads to sparser weights under both unsupervised and supervised settings. Shakeout also induces a grouping effect among the input units of a layer. Since sparse weights better reflect the importance of connections, Shakeout is superior to Dropout in this respect, which is valuable for deep model compression. Moreover, we demonstrate that Shakeout can effectively reduce the instability of the training process of deep architectures.
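For intuition, the following is a minimal sketch (NumPy) of the behaviour described above, contrasting Dropout with a Shakeout-style perturbation applied to a vector of unit contributions: instead of zeroing a unit, the unit is randomly either enhanced or reversed. The constants keep_prob and c below are illustrative placeholders, not the exact hyperparameter scheme of the Shakeout paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, keep_prob=0.5):
    """Standard inverted Dropout: randomly discard units, rescale the rest."""
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

def shakeout_like(x, keep_prob=0.5, c=0.1):
    """Hedged sketch of a Shakeout-style perturbation: with probability
    keep_prob a unit's contribution is enhanced, otherwise it is reversed
    (pushed against its own sign) rather than zeroed out.  The scaling
    here is illustrative, not the paper's exact c/tau formulation."""
    mask = rng.random(x.shape) < keep_prob
    enhanced = x / keep_prob + c * np.sign(x)  # strengthen the contribution
    reversed_ = -c * np.sign(x)                # flip it against its sign
    return np.where(mask, enhanced, reversed_)

# Toy usage on a layer's unit contributions during training:
h = np.array([0.8, -1.2, 0.3, 0.0])
print(dropout(h))        # some entries zeroed
print(shakeout_like(h))  # entries enhanced or sign-reversed
```

Intuitively, the sign-dependent term in the reversed branch acts like an L1-style push toward zero, which is consistent with the combined L0/L1/L2 regularization effect the abstract describes.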
A superconducting thin film of (Li1-xFex)OHFeSe is reported for the first time. The film exhibits a small in-plane crystal mosaic of 0. Moreover, a large critical current density (Jc) of over 0.5 MA/cm² is achieved at ~20 K. Such a (Li1-xFex)OHFeSe film is therefore not only important for fundamental research toward understanding the high-Tc mechanism, but also promising for high-Tc superconductivity applications, especially in high-performance electronic devices and large scientific facilities such as superconducting accelerators.
Superconductivity and ferromagnetism are two mutually antagonistic states in condensed matter. Research on the interplay between these two competing orderings sheds light not only on the cause of various quantum phenomena in strongly correlated systems but also on the general mechanism of superconductivity. Here we report the observation of electronic entanglement between superconducting and ferromagnetic states in hydrogenated boron-doped nanodiamond films, which have a superconducting transition temperature Tc ∼ 3 K and a Curie temperature TCurie > 400 K. In spite of the high TCurie, our nanodiamond films show a decrease in the temperature dependence of the magnetization below 100 K, corresponding to an increase in the temperature dependence of the resistivity. These anomalous magnetic and electrical transport properties reveal the presence of an intriguing precursor phase, in which spin fluctuations intervene as a result of the interplay between the two antagonistic states. Furthermore, the observations of high-temperature ferromagnetism, giant positive magnetoresistance, and an anomalous Hall effect point to potential applications of our superconducting ferromagnetic nanodiamond films in magnetoelectronics, spintronics, and magnetic-field sensing.
Major progress in deep multilayer neural networks (DNNs) has come from the invention of various unsupervised pretraining methods that initialize network parameters and lead to good prediction accuracy. This paper presents a sparseness analysis of the hidden units in the pretraining process. In particular, we use the L-norm to measure sparseness and provide sufficient conditions under which pretraining leads to sparseness for popular pretraining models such as denoising autoencoders (DAEs) and restricted Boltzmann machines (RBMs). Our experimental results demonstrate that when the sufficient conditions are satisfied, these pretraining models do lead to sparseness. Our experiments also reveal that with sigmoid activation functions, pretraining plays an important sparseness-inducing role in DNNs with sigmoid units (Dsigm), whereas with rectified linear unit (ReLU) activation functions, pretraining becomes less effective for DNNs with ReLU (Drelu). Fortunately, Drelu can reach higher recognition accuracy than DNNs with pretraining (DAEs and RBMs), as ReLU captures the main benefit of pretraining in Dsigm (such as encouraging sparseness). However, ReLU does not adapt to the different firing rates of biological neurons, whose firing rates actually change with varying membrane resistances. To address this problem, we further propose a family of rectifier piecewise linear units (RePLUs) to fit the different firing rates. The experimental results show that RePLU performs better than ReLU and is comparable with pretraining techniques such as RBMs and DAEs.
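As a rough illustration of the sparseness comparison above, here is a hypothetical sketch that measures how sparse the hidden activations of a sigmoid layer versus a ReLU layer are on random inputs, using the fraction of near-zero activations as a simple proxy (the specific norm used in the paper is not recoverable from this extract, and RePLU is omitted because its definition is not given here).

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0.0)

def sparseness(activations, tol=1e-3):
    """Fraction of activations that are (near) zero: a simple sparseness proxy."""
    return float(np.mean(np.abs(activations) < tol))

# Hypothetical hidden layer: random inputs and weights, no pretraining.
X = rng.standard_normal((256, 100))        # batch of inputs
W = rng.standard_normal((100, 50)) * 0.1   # small random weights
Z = X @ W                                  # pre-activations

print("sigmoid sparseness:", sparseness(sigmoid(Z)))  # ~0: sigmoid outputs rarely land near zero
print("relu sparseness:   ", sparseness(relu(Z)))     # ~0.5: ReLU zeroes negative pre-activations
```

This reflects the point made in the abstract: ReLU yields sparse activations by construction, whereas a sigmoid layer relies on pretraining to become sparse.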