Approximation of a function and its derivatives in feedforward neural networks

Basson, Etienne P.; Engelbrecht, Andries P.

doi:10.1109/ijcnn.1999.831531

Cited by 7 publications

(5 citation statements)

References 20 publications

“…For some low-dimensional cases, it allows deviations from targets to come close to the rounding error of single precision used during the training, thus addressing the gap between describing a function by an array of values and by a neural network. The concept of using derivatives for approximation [22] is quite common and was investigated for neural networks in numerous studies [23]-[29], however, the implementations of training in said papers included only low order derivatives and used somewhat small architectures, since the conditions of tests did not lead to precision gains of few orders of magnitude. Even though requirements for architectures of neural networks to approximate derivatives are usually modest [4], extra layers are sometimes necessary [30].…”

Section: Introduction (mentioning)

confidence: 99%

Enhancing Function Approximation Abilities of Neural Networks by Training Derivatives

Avrutskiy

2021

IEEE Trans. Neural Netw. Learning Syst.

A method to increase the precision of feedforward networks is proposed. It requires prior knowledge of the target function's derivatives of several orders and uses this information in gradient-based training. The forward pass calculates not only the values of the output layer of a network but also their derivatives. The deviations of those derivatives from the target ones are used in an extended cost function, and the backward pass then calculates the gradient of the extended cost with respect to the weights, which can be used by any weight-update algorithm. Despite a substantial increase in arithmetic operations per pattern (compared to conventional training), the extended cost yields a 140-1000 times more accurate approximation for simple cases when the total number of operations is equal. This precision also happens to be out of reach for the regular cost function. The method fits well into the procedure of solving differential equations with neural networks. Unlike training a network to match some target mapping, which requires explicit use of the target derivatives in the extended cost function, the cost function for solving a differential equation is based on the deviation of the equation's residual from zero and thus can be extended by differentiating the equation itself, which does not require any prior knowledge. Solving an equation with such a cost gave a 13 times more accurate result and could be done with a 3 times larger grid step. A GPU-efficient algorithm for calculating the gradient of the extended cost function is proposed.
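The idea of an extended cost can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a scalar input, a single tanh hidden layer, first derivatives only, and equal weighting of the value and derivative terms. The names `forward` and `extended_cost` and the example parameters are hypothetical, and the backward pass that produces the gradient with respect to the weights is not shown.

```python
import math

def forward(params, x):
    """Return (N(x), dN/dx) for a one-hidden-layer tanh network with scalar input."""
    w, b, v, c = params  # hidden weights, hidden biases, output weights, output bias
    value, slope = c, 0.0
    for wj, bj, vj in zip(w, b, v):
        t = math.tanh(wj * x + bj)
        value += vj * t
        # Chain rule: d/dx tanh(w*x + b) = w * (1 - tanh(w*x + b)^2)
        slope += vj * wj * (1.0 - t * t)
    return value, slope

def extended_cost(params, xs, f, df):
    """Mean squared deviation of the outputs and of their first derivatives.

    Assumption: both terms are weighted equally; the source does not state a weighting.
    """
    total = 0.0
    for x in xs:
        value, slope = forward(params, x)
        total += (value - f(x)) ** 2 + (slope - df(x)) ** 2
    return total / len(xs)

# Hypothetical example: three hidden units, target sin(x) with known derivative cos(x).
params = ([1.0, -0.5, 2.0], [0.0, 0.3, -1.0], [0.7, -1.2, 0.4], 0.1)
xs = [i / 10.0 for i in range(-10, 11)]
cost = extended_cost(params, xs, math.sin, math.cos)
```

Any gradient-based optimizer could then minimize `extended_cost` over `params`; higher-order derivative terms would be added to the sum in the same way.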

“…The second, less common approach is to enforce the condition D(N) = D(O) with non-zero right side as an addition to the objective N = O. This idea has been implemented in [12,7,21,28,63] with moderate success. The author's paper [2] showed how this method can significantly increase the accuracy of the initial objective.…”

Section: Training Derivatives (mentioning)

confidence: 99%

Preventing Overfitting by Training Derivatives

Avrutskiy

2019

Advances in Intelligent Systems and Computing

Derivative training is a well-known method to improve the accuracy of neural networks. In the forward pass, not only the output values are computed, but also their derivatives, and their deviations from the target derivatives are included in the cost function, which is minimized with respect to the weights by a gradient-based algorithm. So far, this method has been implemented for relatively low-dimensional tasks. In this study, we apply the approach to the problem of image analysis. We consider the task of reconstructing the vertices of a cube based on its image. By training the derivatives with respect to the 6 degrees of freedom of the cube, we obtain 25 times more accurate results for noiseless inputs. The derivatives also provide important insights into the robustness problem, which is currently understood in terms of two types of network vulnerabilities. The first type is small perturbations that dramatically change the output, and the second type is substantial image changes that the network erroneously ignores. They are currently considered as conflicting goals, since conventional training methods produce a trade-off. The first type can be analyzed via the gradient of the network, but the second type requires human evaluation of the inputs, which is an oracle substitute. For the task at hand, the nearest neighbor oracle can be defined, and the knowledge of derivatives allows it to be expanded into a Taylor series. This makes it possible to perform a first-order robustness analysis that unifies both types of vulnerabilities, and to implement robust training that eliminates any trade-offs, so that accuracy and robustness are limited only by network capacity.

“…Now, using (9), (11), and (13), we can compute the elements of the gradient involving the weights using (8). For the elements of the gradient involving the bias terms, we can use…”

Section: A Gradient Calculation (mentioning)

confidence: 99%

“…In the second stage, the parameters are adjusted to improve the derivative approximation. In [11], the derivative fitting was performed by adding an extra output unit for each partial derivative to the regular structure of a feedforward neural network that was used to approximate the function. The standard backpropagation procedure was then used to train the proposed network structure.…”

Section: Introduction (mentioning)

confidence: 99%

Practical Training Framework for Fitting a Function and Its Derivatives

Pukrittayakamee, Hagan, Raff et al.

2011

IEEE Trans. Neural Netw.

This paper describes a practical framework for using multilayer feedforward neural networks to simultaneously fit both a function and its first derivatives. This framework involves two steps. The first step is to train the network to optimize a performance index, which includes both the error in fitting the function and the error in fitting the derivatives. The second step is to prune the network by removing neurons that cause overfitting and then to retrain it. This paper describes two novel types of overfitting that are only observed when simultaneously fitting both a function and its first derivatives. A new pruning algorithm is proposed to eliminate these types of overfitting. Experimental results show that the pruning algorithm successfully eliminates the overfitting and produces the smoothest responses and the best generalization among all the training algorithms that we have tested.
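One plausible way to write such a performance index is given below. It is a sketch under stated assumptions, not the formula from the paper: the squared-error form, the weighting factor $\rho$, and the symbols $t_q$ (target), $a_q$ (network output), $x_i$ (the $i$-th of $n$ inputs), and $Q$ (number of training points) are all introduced here for illustration.

```latex
J = \sum_{q=1}^{Q} \left[ \left( t_q - a_q \right)^2
    + \rho \sum_{i=1}^{n} \left( \frac{\partial t_q}{\partial x_i}
    - \frac{\partial a_q}{\partial x_i} \right)^2 \right]
```

Setting $\rho = 0$ recovers the usual function-only squared error, while larger values of $\rho$ put more emphasis on matching the first derivatives.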
