2020
DOI: 10.32890/jict.20.1.2021.9267

Parametric Flatten-T Swish: An Adaptive Nonlinear Activation Function for Deep Learning

Abstract: Activation function is a key component in deep learning that performs non-linear mappings between the inputs and outputs. Rectified Linear Unit (ReLU) has been the most popular activation function across the deep learning community. However, ReLU contains several shortcomings that can result in inefficient training of deep neural networks: 1) the negative cancellation property of ReLU tends to treat negative inputs as unimportant information for the learning, resulting in performance degradation…

Cited by 6 publications (3 citation statements) | References 20 publications
“…Swish [29] is a non-piecewise activation function based on the Sigmoid function, which achieves better expressive power and faster convergence than ReLU. Parametric Flatten-T Swish (PFTS) [30] is an adaptive nonlinear activation function that exhibits higher nonlinear approximation power during training.…”
Section: Related Work
confidence: 99%
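To ground the comparison in the statement above, here is a minimal Python/PyTorch sketch of Swish (the framework choice is an assumption, and the beta parameter follows the original Swish formulation rather than anything in the excerpt):

```python
import torch

def swish(x: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    # Swish: x * sigmoid(beta * x). Smooth and non-monotonic, unlike
    # the piecewise-linear ReLU; beta = 1.0 recovers the SiLU form.
    return x * torch.sigmoid(beta * x)
```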
“…A Parametric Flatten-T Swish (PFTS) [501] is an adaptive extension of the FTS (see section 3.6.46); PFTS is identical to FTS except that the parameter T is adaptive, i.e.:

PFTS(x_i) = x_i · σ(x_i) + T_i if x_i ≥ 0, and PFTS(x_i) = T_i if x_i < 0,…”
Section: Parametric Flatten-T Swish (PFTS)
confidence: 99%
“…where T_i is a trainable parameter for each neuron i [501]; the parameter T_i is initialized to the value -0.20 [501].…”
Section: Parametric Flatten-T Swish (PFTS)
confidence: 99%
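A minimal PyTorch sketch of PFTS under these definitions (the class name, the num_neurons parameter, and the framework are illustrative assumptions; the positive branch x · σ(x) + T follows the FTS form the excerpt references):

```python
import torch
import torch.nn as nn

class PFTS(nn.Module):
    """Parametric Flatten-T Swish with one trainable threshold T_i per neuron."""

    def __init__(self, num_neurons: int):
        super().__init__()
        # Each T_i is trainable and initialized to -0.20, as stated in [501].
        self.T = nn.Parameter(torch.full((num_neurons,), -0.20))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Positive inputs follow the Swish-like branch x * sigmoid(x) shifted
        # by T_i; negative inputs are flattened to T_i (not to 0 as in ReLU).
        return torch.where(x >= 0, x * torch.sigmoid(x) + self.T,
                           self.T.expand_as(x))

# Usage: negative entries map to T_i (all -0.20 at initialization).
act = PFTS(num_neurons=4)
y = act(torch.tensor([[-1.0, -0.5, 0.5, 2.0]]))
```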