2018
DOI: 10.1103/physreve.97.052307
Loss surface of XOR artificial neural networks

Abstract: Training an artificial neural network involves an optimization process over the landscape defined by the cost (loss) as a function of the network parameters. We explore these landscapes using optimization tools developed for potential energy landscapes in molecular science. The number of local minima and transition states (saddle points of index one), as well as the ratio of transition states to minima, grow rapidly with the number of nodes in the network. There is also a strong dependence on the regularizatio…
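The abstract's census of minima and transition states rests on classifying stationary points by Hessian index, the number of negative eigenvalues: index 0 is a minimum, index 1 a transition state. The sketch below, in Python/NumPy rather than the authors' GMIN/OPTIM code, sets up an XOR sum-of-squares loss for a small 2-2-1 sigmoid network with an optional L2 penalty and reads off the index of a candidate stationary point from a finite-difference Hessian; the network size, parameter packing, and step sizes are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Four XOR patterns: two binary inputs, one binary target each.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, lam=0.0):
    """Sum-of-squares loss of a 2-2-1 sigmoid network plus an optional L2 penalty.
    Assumed parameter packing (9 values): W1 (2x2), b1 (2), W2 (2), b2 (1)."""
    W1, b1, W2, b2 = w[:4].reshape(2, 2), w[4:6], w[6:8], w[8]
    h = sigmoid(X @ W1.T + b1)           # hidden activations, shape (4, 2)
    out = sigmoid(h @ W2 + b2)           # network output for the four patterns
    return 0.5 * np.sum((out - y) ** 2) + 0.5 * lam * np.dot(w, w)

def hessian_index(w, lam=0.0, eps=1e-4, tol=1e-6):
    """Count negative Hessian eigenvalues at w (index 0 = minimum,
    index 1 = transition state) using central finite differences."""
    n = w.size
    H = np.zeros((n, n))
    E = np.eye(n) * eps
    for i in range(n):
        for j in range(n):
            H[i, j] = (loss(w + E[i] + E[j], lam) - loss(w + E[i] - E[j], lam)
                       - loss(w - E[i] + E[j], lam) + loss(w - E[i] - E[j], lam)) / (4 * eps**2)
    eigvals = np.linalg.eigvalsh(0.5 * (H + H.T))   # symmetrise before diagonalising
    return int(np.sum(eigvals < -tol)), eigvals     # tolerance absorbs finite-difference noise
```

In practice w would first be converged to a stationary point (by minimisation or a transition-state search) before its index is read off.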

Help me understand this report
View preprint versions

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

2
21
0
1

Year Published

2020
2020
2023
2023

Publication Types

Select...
5
2
1

Relationship

0
8

Authors

Journals

Cited by 24 publications (24 citation statements).
References 98 publications.
“…Since the regularisation term is a convex L2 penalty, it is possible that part of the single-funnelled appearance of the reduced-connectivity networks is due purely to regularisation; i.e. higher L2 regularisation convexifies the landscape [15]. Again, for the fully-connected case, we observed a single-funnelled appearance, substantiating our previous suggestion that this type of landscape is architecture dependent.…”
Section: Landscapes With Reduced Connectivity (supporting)
confidence: 88%
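A small numerical illustration (mine, not from either paper) of the convexifying effect mentioned above: adding the penalty (λ/2)‖w‖² to the loss adds λ to every eigenvalue of its Hessian, so any λ larger than the most negative curvature encountered on the landscape leaves no negative or zero eigenvalues. The 3x3 matrix below is a made-up stand-in for a loss Hessian; NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))     # random orthogonal basis
H = Q @ np.diag([1.2, -0.4, 0.0]) @ Q.T          # stand-in Hessian: one negative, one zero eigenvalue
lam = 0.5                                        # L2 penalty strength
print(np.linalg.eigvalsh(H))                     # ~[-0.4, 0.0, 1.2]
print(np.linalg.eigvalsh(H + lam * np.eye(3)))   # ~[0.1, 0.5, 1.7]: no negative curvature left
```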
“…where c(α) is the known outcome for input data item α in the training set. The regularisation term biases against large values for the weights and shifts any zero eigenvalues of the Hessian (second derivative) matrix, which would otherwise complicate transition state searches [15,23]. To accelerate computation of the potential, a GPU version [24] of the loss function and gradient was also implemented and is available in the public domain GMIN and OPTIM programmes [25][26][27].…”
Section: Defining the Network (mentioning)
confidence: 99%
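The statement above refers to a GPU implementation of the loss and gradient distributed with GMIN and OPTIM; that code is not reproduced here. As a hedged stand-in, the sketch below writes the analytic gradient of an L2-regularised sum-of-squares loss for a minimal 2-2-1 sigmoid XOR network and checks it against finite differences; the architecture, parameter packing, and λ value are illustrative assumptions, not the published implementation.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
c = np.array([0.0, 1.0, 1.0, 0.0])      # known outcomes c(alpha) for the four patterns

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(w):
    # Assumed parameter packing (9 values): W1 (2x2), b1 (2), W2 (2), b2 (1).
    return w[:4].reshape(2, 2), w[4:6], w[6:8], w[8]

def loss(w, lam=0.01):
    W1, b1, W2, b2 = unpack(w)
    h = sigmoid(X @ W1.T + b1)
    out = sigmoid(h @ W2 + b2)
    return 0.5 * np.sum((out - c) ** 2) + 0.5 * lam * np.dot(w, w)

def grad(w, lam=0.01):
    """Analytic gradient of the regularised loss (plain backpropagation)."""
    W1, b1, W2, b2 = unpack(w)
    a1 = X @ W1.T + b1
    h = sigmoid(a1)
    a2 = h @ W2 + b2
    out = sigmoid(a2)
    d2 = (out - c) * out * (1.0 - out)       # dE/da2, shape (4,)
    gW2, gb2 = d2 @ h, d2.sum()
    d1 = np.outer(d2, W2) * h * (1.0 - h)    # dE/da1, shape (4, 2)
    gW1, gb1 = d1.T @ X, d1.sum(axis=0)
    return np.concatenate([gW1.ravel(), gb1, gW2, [gb2]]) + lam * w

# Check the analytic gradient against central finite differences.
w = np.random.default_rng(0).normal(size=9)
eps = 1e-6
numerical = np.array([(loss(w + eps * e) - loss(w - eps * e)) / (2 * eps) for e in np.eye(9)])
print(np.max(np.abs(numerical - grad(w))))   # should be tiny (finite-difference error only)
```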
“…The XOR problem requires the NN to model the "exclusive-or" logical gate using four binary patterns of two inputs and one output. Despite its seeming triviality, the XOR problem is not linearly separable, and thus makes a good case study for fundamental NN properties [26]. The MNIST dataset of handwritten digits [27] contains 70 000 examples of grey scale handwritten digits from 0 to 9, where 60 000 examples constitute the training set, and the remaining 10 000 constitute the test set.…”
Section: A. Benchmark Problems (mentioning)
confidence: 99%
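The claim that XOR is not linearly separable can be checked mechanically: strict separability of the four patterns is equivalent to the feasibility of the margin constraints w·x + b ≥ +1 for targets 1 and w·x + b ≤ −1 for targets 0. The sketch below (not from the cited benchmark code; SciPy's linprog is assumed available) poses that feasibility problem as a linear program and finds it infeasible.

```python
import numpy as np
from scipy.optimize import linprog

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# Encode every pattern's sign constraint as A_ub @ [w1, w2, b] <= -1.
signs = np.where(y == 1, -1.0, 1.0)                 # flip rows that require ">= +1"
A_ub = signs[:, None] * np.hstack([X, np.ones((4, 1))])
b_ub = -np.ones(4)

res = linprog(c=np.zeros(3), A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * 3, method="highs")
print(res.status)    # 2 = infeasible: no separating hyperplane exists
```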
“…Along the same lines, artificial neural networks (ANNs) have been the most widely used soft-computing tool for tasks that require pattern recognition in a data set, such as images. An ANN is built from several layers and numbers of artificial neurons, which constitute the processing unit, whose mathematical model admits several data inputs and a single output that is a weighted combination of its inputs [14], [15]. Connecting several neurons within an ANN yields a powerful parallel computing tool, albeit one that delivers approximate rather than definitive outputs.…”
Section: Ta Perspectivas (unclassified)