2020 IEEE 18th World Symposium on Applied Machine Intelligence and Informatics (SAMI)
DOI: 10.1109/sami48414.2020.9108717

A Review of Activation Function for Artificial Neural Network

Cited by 183 publications (84 citation statements); References 8 publications.
“…We consider Assumptions 1-3 for all three function classes. Assumption 2 holds for a subset of classification problems, such as overparametrized neural network models with Sigmoid activation functions [33], and for constrained optimization problems. The relaxation of this assumption for undirected communication networks is studied in [29], and the extension to directed graphs remains an open question.…”
Section: Assumption 2 (Bounded Gradients) There Exists a Constant
Citation type: mentioning
confidence: 99%
“…Abstracting the trick, we may interpret the rectification trick as the application of a nonlinear activation function g(t) = |t|: instead of analyzing f(t), one analyzes g(f(t)). Naturally, there are many other choices of activation function [7], like the commonly used Rectified Linear Unit (ReLU), where one analyzes ReLU(f(t)), and other widely applied activation functions from the theory of neural networks that are at our disposal. In practice, ReLU seems to work, and to work roughly as well as the absolute value.…”
Section: The Rectification Trick
Citation type: mentioning
confidence: 99%
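As a concrete illustration of the two choices of g discussed in this statement, the following sketch applies both the absolute value and ReLU to a made-up stand-in for f; it is illustrative only and not code from the cited work.

```python
import numpy as np

def relu(t):
    """Rectified Linear Unit: elementwise max(t, 0)."""
    return np.maximum(t, 0.0)

# Hypothetical stand-in for f(t); the actual f in the cited paper is
# application-specific and not reproduced here.
t = np.linspace(0.0, 1.0, 1000)
f = np.sin(12 * np.pi * t) * np.exp(-3.0 * t)

rectified_abs = np.abs(f)    # the rectification trick: analyze |f(t)|
rectified_relu = relu(f)     # alternative: analyze ReLU(f(t))

# Both transforms discard the sign of f; ReLU zeroes the negative part,
# while |.| folds it onto the positive side.
print(rectified_abs.mean(), rectified_relu.mean())
```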
“…However, with years of practice in several fields, the Sigmoid's point of weakness, its small derivative, which leads to the vanishing gradient problem, became accepted, and other activation functions have been explored and used instead, such as softmax and ReLU. ReLU, for example, has a derivative of one for every positive input [10].…”
Section: Activation Functions Used
Citation type: mentioning
confidence: 99%
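To make the derivative comparison concrete, here is a minimal sketch (illustrative only, not drawn from the reviewed paper) that evaluates both gradients: the sigmoid's derivative peaks at 0.25 and shrinks quickly for large |z|, while ReLU's derivative is exactly one for every positive input.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """sigma'(z) = sigma(z) * (1 - sigma(z)); at most 0.25."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_grad(z):
    """ReLU'(z): 1 for positive inputs, 0 otherwise."""
    return (z > 0).astype(float)

z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(sigmoid_grad(z))  # approx. [0.0066 0.1966 0.25 0.1966 0.0066]
print(relu_grad(z))     # [0. 0. 0. 1. 1.]
```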
“…It is non-linear by nature and has a smooth derivative, as shown in figure 5. Due to the output range of the sigmoid, [0, 1], the output of each unit is also squashed, causing the gradient to vanish, especially in a deep network [10].…”
Section: Sigmoid
Citation type: mentioning
confidence: 99%
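A minimal sketch of why this squashing leads to vanishing gradients with depth (the layer count and pre-activation values below are made up for illustration): backpropagation multiplies in one factor of sigma'(z) <= 0.25 per sigmoid layer, so the product shrinks geometrically.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical depth and pre-activation values, for illustration only.
depth = 20
rng = np.random.default_rng(0)
pre_activations = rng.normal(size=depth)

grad = 1.0
for z in pre_activations:
    s = sigmoid(z)
    grad *= s * (1.0 - s)  # each sigmoid layer contributes a factor <= 0.25

# The surviving gradient is at most 0.25**depth (~9e-13 for depth 20),
# i.e. it has effectively vanished.
print(grad)
```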