2009
DOI: 10.1118/1.3213517

Noise injection for training artificial neural networks: A comparison with weight decay and early stopping

Abstract: The purpose of this study was to investigate the effect of a noise injection method on the "overfitting" problem of artificial neural networks (ANNs) in two-class classification tasks. The authors compared ANNs trained with noise injection to ANNs trained with two other methods for avoiding overfitting: weight decay and early stopping. They also evaluated an automatic algorithm for selecting the magnitude of the noise injection. They performed simulation studies of an exclusive-or classification task with trai…
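
The technique studied in the paper can be summarized as perturbing each training sample with fresh Gaussian noise on every pass, so the network never fits exactly the same points. Below is a minimal sketch of that idea on a two-class XOR-style task; the network size, noise magnitude `sigma`, optimizer, and training length are illustrative assumptions, not the configuration used in the paper (which selects the noise magnitude automatically).

```python
# Minimal sketch of training-time noise injection on a two-class XOR task.
# All hyperparameters are illustrative assumptions, not the authors' settings.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Noise-free XOR-style training set: four points, two classes, repeated 25 times.
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]]).repeat(25, 1)
y = torch.tensor([0., 1., 1., 0.]).repeat(25).unsqueeze(1)

model = nn.Sequential(nn.Linear(2, 8), nn.Tanh(), nn.Linear(8, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

sigma = 0.2  # assumed noise magnitude; the paper selects this value automatically

for epoch in range(500):
    # Noise injection: perturb the inputs with fresh Gaussian noise each epoch,
    # so the network never sees exactly the same training points twice.
    X_noisy = X + sigma * torch.randn_like(X)
    optimizer.zero_grad()
    loss = loss_fn(model(X_noisy), y)
    loss.backward()
    optimizer.step()
```

For comparison, the weight-decay alternative mentioned in the abstract corresponds to passing a nonzero `weight_decay` to the optimizer instead of perturbing the inputs, and early stopping corresponds to halting training once a held-out validation loss stops improving.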

Cited by 164 publications (76 citation statements)
References 30 publications
“…Random noise injection into the weights or the hidden units has been used in neural network research for many years [15], [19]–[21]. Kurita et al. [9] injected noise into the hidden layers of an MLP and showed that the network structure can be organized automatically simply by adding noise, thereby improving the generalization ability of the network.…”
Section: Related Work
confidence: 99%
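
Hidden-unit noise injection, as attributed to Kurita et al. above, can be sketched as a small module that perturbs hidden activations only in training mode. The layer widths and noise level below are illustrative assumptions, not the reference's settings.

```python
# Minimal sketch of Gaussian noise injection into the hidden layer of an MLP.
import torch
import torch.nn as nn

class GaussianNoise(nn.Module):
    """Adds zero-mean Gaussian noise to its input during training only."""
    def __init__(self, sigma: float = 0.1):
        super().__init__()
        self.sigma = sigma

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training and self.sigma > 0:
            return x + self.sigma * torch.randn_like(x)
        return x

mlp = nn.Sequential(
    nn.Linear(10, 32), nn.Tanh(), GaussianNoise(0.1),  # noisy hidden layer
    nn.Linear(32, 2),
)
# mlp.train() enables the noise; mlp.eval() disables it for clean inference.
```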
“…Therefore, to avoid overfitting in the proposed RNN model, three standard techniques were used: Gaussian noise injection into the training data [32], a ReLU activation function in the hidden layers [20], and, subsequently, the dropout technique [29]. Dropout leads to substantial improvements in the prediction performance of the model.…”
Section: Mitigating Overfitting
confidence: 99%
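
The three techniques listed in this statement (Gaussian noise on the training inputs, ReLU hidden units, and dropout) can be combined in one small recurrent model, as sketched below; the model dimensions, noise level, and dropout rate are illustrative assumptions rather than the cited paper's architecture.

```python
# Minimal sketch: input-noise injection + ReLU recurrent units + dropout.
import torch
import torch.nn as nn

class SmallRNN(nn.Module):
    def __init__(self, n_features=4, hidden=32, sigma=0.05, p_drop=0.3):
        super().__init__()
        self.sigma = sigma                           # std of input noise (assumed)
        self.rnn = nn.RNN(n_features, hidden,
                          nonlinearity="relu",       # ReLU hidden units
                          batch_first=True)
        self.drop = nn.Dropout(p_drop)               # dropout before the output layer
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):
        if self.training:
            # Gaussian noise injection into the training sequences only.
            x = x + self.sigma * torch.randn_like(x)
        h, _ = self.rnn(x)
        return self.out(self.drop(h[:, -1]))         # predict from the last time step

model = SmallRNN()
pred = model(torch.randn(8, 20, 4))  # batch of 8 sequences, 20 steps, 4 features
```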
“…The first element of the inverse-forward ANNs is the inverse ANN, which is trained with noisy data and is responsible for filtering the noise, while the second element is the forward ANN, which eliminates the need for FEM. Training the inverse ANN with noise-free FEM data would result in overfitting and its subsequent failure (Arbabi et al., 2015a; Zur et al., 2009). The inverse ANN trained with noisy data (Gaussian random noise) is most sensitive to the general trend of the experimental data without being influenced by small deviations from FEM caused by uncertainties in the experimental data.…”
Section: Discussion
confidence: 99%
“…Our previous study showed that ANNs are very sensitive to any deviations from the underlying computational model used for their training (Arbabi et al., 2015a). To alleviate this problem, the training data of ANNs can be contaminated with some level of noise (Arbabi et al., 2015a; Derks et al., 1997; Zur et al., 2009) to increase the robustness of the ANN. Therefore, we trained the ANN using the input concentration vs. time curves contaminated with different levels of Gaussian noise, i.e.…”
Section: Inverse-Forward Artificial Neural Network
confidence: 99%
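
The data-contamination step described here amounts to generating several noisy copies of each noise-free input curve before training. A minimal sketch follows; the curve shape, noise levels, and array sizes are stand-ins, not the cited study's values.

```python
# Minimal sketch of contaminating a noise-free input curve with several
# levels of Gaussian noise before it is used as ANN training input.
import torch

torch.manual_seed(0)

t = torch.linspace(0.0, 10.0, 200)        # time axis (assumed)
clean_curve = torch.exp(-0.5 * t)         # stand-in for a concentration-vs-time curve

noise_levels = [0.01, 0.02, 0.05]         # assumed noise standard deviations
training_inputs = torch.stack([
    clean_curve + sigma * torch.randn_like(clean_curve)
    for sigma in noise_levels
])                                        # shape: (len(noise_levels), 200)
```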