2020 IEEE 18th World Symposium on Applied Machine Intelligence and Informatics (SAMI)
DOI: 10.1109/sami48414.2020.9108717

A Review of Activation Function for Artificial Neural Network

Cited by 183 publications (84 citation statements); References 8 publications.
“…We consider Assumptions 1-3 for all three function classes. Assumption 2 holds for a subset of classification problems, such as overparametrized neural network models with Sigmoid activation functions [33], and for constrained optimization problems. The relaxation of this assumption for undirected communication networks is studied in [29], and the extension to directed graphs remains an open question.…”
Section: Assumption 2 (Bounded Gradients) There Exists a Constant
Citation type: mentioning
confidence: 99%
“…Abstracting the trick, we may interpret the rectification trick as the application of a nonlinear activation function g(t) = |t|: instead of analyzing f(t), one analyzes g(f(t)). Naturally, there are many other choices of activation function [7], like the commonly used Rectified Linear Unit (ReLU), where one analyzes ReLU(f(t)), and other widely applied activation functions from the theory of neural networks that are at our disposal. In practice, ReLU seems to work, and to work roughly as well as the absolute value.…”
Section: The Rectification Trick
Citation type: mentioning
confidence: 99%
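As a concrete illustration of the two choices of g discussed in this statement, the following sketch applies both the absolute value and ReLU to a made-up stand-in for f; it is illustrative only and not code from the cited work.

```python
import numpy as np

def relu(t):
    """Rectified Linear Unit: elementwise max(t, 0)."""
    return np.maximum(t, 0.0)

# Hypothetical stand-in for f(t); the actual f in the cited paper is
# application-specific and not reproduced here.
t = np.linspace(0.0, 1.0, 1000)
f = np.sin(12 * np.pi * t) * np.exp(-3.0 * t)

rectified_abs = np.abs(f)    # the rectification trick: analyze |f(t)|
rectified_relu = relu(f)     # alternative: analyze ReLU(f(t))

# Both transforms discard the sign of f; ReLU zeroes the negative part,
# while |.| folds it onto the positive side.
print(rectified_abs.mean(), rectified_relu.mean())
```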
“…However, with years of practice in several fields, the Sigmoid's point of weakness, its small derivative, which leads to the vanishing gradient problem, became accepted, and other activation functions have been explored and used instead, such as softmax and ReLU. ReLU, for example, has a derivative of one for every positive input [10].…”
Section: Activation Functions Used
Citation type: mentioning
confidence: 99%
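To make the derivative comparison concrete, here is a minimal sketch (illustrative only, not drawn from the reviewed paper) that evaluates both gradients: the sigmoid's derivative peaks at 0.25 and shrinks quickly for large |z|, while ReLU's derivative is exactly one for every positive input.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    """sigma'(z) = sigma(z) * (1 - sigma(z)); at most 0.25."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_grad(z):
    """ReLU'(z): 1 for positive inputs, 0 otherwise."""
    return (z > 0).astype(float)

z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(sigmoid_grad(z))  # approx. [0.0066 0.1966 0.25 0.1966 0.0066]
print(relu_grad(z))     # [0. 0. 0. 1. 1.]
```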
“…It is non-linear by nature and has a smooth derivative, as shown in figure 5. Due to the output range of the sigmoid, [0, 1], the output of each unit is also squashed, causing the gradient to vanish, especially in a deep network [10].…”
Section: Sigmoid
Citation type: mentioning
confidence: 99%
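A minimal sketch of why this squashing leads to vanishing gradients with depth (the layer count and pre-activation values below are made up for illustration): backpropagation multiplies in one factor of sigma'(z) <= 0.25 per sigmoid layer, so the product shrinks geometrically.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical depth and pre-activation values, for illustration only.
depth = 20
rng = np.random.default_rng(0)
pre_activations = rng.normal(size=depth)

grad = 1.0
for z in pre_activations:
    s = sigmoid(z)
    grad *= s * (1.0 - s)  # each sigmoid layer contributes a factor <= 0.25

# The surviving gradient is at most 0.25**depth (~9e-13 for depth 20),
# i.e. it has effectively vanished.
print(grad)
```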