Accelerating Extreme Search Based on Natural Gradient Descent with Beta Distribution

Lyakhov, Pavel; Abdulkadirov, Ruslan

doi:10.1109/ent50460.2021.9681769

Cited by 3 publications

(5 citation statements)

References 4 publications

“…We verified the capability of the proposed algorithm to converge in the neighborhood of the global minimum in the case of the Rastrigin and Rosenbrock functions, where known algorithms do not achieve the global minimum. Such experiments differ from the experiments in [14,15].…”

Section: Discussion (contrasting)

confidence: 65%

“…In [14], which is a continuation of [15], we explored the natural gradient descent based on Dirichlet distribution. In this research, we added and calculated the Fisher information matrix of the generalized Dirichlet distribution.…”

Section: Discussion (mentioning)

confidence: 99%

“…The beta distribution is the Dirichlet and generalized Dirichlet distribution in three-dimensional Euclidean space. Hence, the Fisher matrix of beta distribution [15] is…”

Section: Three-dimensional Case (mentioning)

confidence: 99%

“…We demonstrate that the natural gradient descent with step-size adaptation based on Dirichlet and generalized Dirichlet distributions has higher accuracy and does not take a large number of iterations for minimizing test functions compared to gradient descent and Adam. Such an approach is a continuation of works [14,15], where the final results did not present the ability of natural gradient descent to converge in the neighborhood of the global minimum.…”

Section: Introduction (mentioning)

confidence: 99%

Accelerating Extreme Search of Multidimensional Functions Based on Natural Gradient Descent with Dirichlet Distributions

2022

The high accuracy attainment, using less complex architectures of neural networks, remains one of the most important problems in machine learning. In many studies, increasing the quality of recognition and prediction is obtained by extending neural networks with usual or special neurons, which significantly increases the time of training. However, engaging an optimization algorithm, which gives us a value of the loss function in the neighborhood of global minimum, can reduce the number of layers and epochs. In this work, we explore the extreme searching of multidimensional functions by proposed natural gradient descent based on Dirichlet and generalized Dirichlet distributions. The natural gradient is based on describing a multidimensional surface with probability distributions, which allows us to reduce the change in the accuracy of gradient and step size. The proposed algorithm is equipped with step-size adaptation, which allows it to obtain higher accuracy, taking a small number of iterations in the process of minimization, compared with the usual gradient descent and adaptive moment estimate. We provide experiments on test functions in four- and three-dimensional spaces, where natural gradient descent proves its ability to converge in the neighborhood of global minimum. Such an approach can find its application in minimizing the loss function in various types of neural networks, such as convolution, recurrent, spiking and quantum networks.

“…However, by selecting appropriate probability distribution, such as Gauss and Dirichlet, we can reduce the variable θ in the Fisher information matrix, which makes it possible to avoid its calculation in every iteration. Such approach is realized in [113][114][115][116]. The natural gradient descent can replace second-order optimization algorithms due to convergence rate and time consumption.…”

Section: Probability Density Function (mentioning)

confidence: 99%

Survey of Optimization Algorithms in Modern Neural Networks

2023

Self Cite

The main goal of machine learning is the creation of self-learning algorithms in many areas of human activity. It allows a replacement of a person with artificial intelligence in seeking to expand production. The theory of artificial neural networks, which have already replaced humans in many problems, remains the most well-utilized branch of machine learning. Thus, one must select appropriate neural network architectures, data processing, and advanced applied mathematics tools. A common challenge for these networks is achieving the highest accuracy in a short time. This problem is solved by modifying networks and improving data pre-processing, where accuracy increases along with training time. By using optimization methods, one can improve the accuracy without increasing the time. In this review, we consider all existing optimization algorithms that meet in neural networks. We present modifications of optimization algorithms of the first, second, and information-geometric order, which are related to information geometry for Fisher–Rao and Bregman metrics. These optimizers have significantly influenced the development of neural networks through geometric and probabilistic tools. We present applications of all the given optimization algorithms, considering the types of neural networks. After that, we show ways to develop optimization algorithms in further research using modern neural networks. Fractional order, bilevel, and gradient-free optimizers can replace classical gradient-based optimizers. Such approaches are induced in graph, spiking, complex-valued, quantum, and wavelet neural networks. Besides pattern recognition, time series prediction, and object detection, there are many other applications in machine learning: quantum computations, partial differential, and integrodifferential equations, and stochastic processes.

Improving Extreme Search with Natural Gradient Descent Using Dirichlet Distribution

Abdulkadirov; Lyakhov

2022

Lecture Notes in Networks and Systems
