2019
DOI: 10.1016/j.neucom.2019.08.065

A simple and efficient architecture for trainable activation functions

Abstract: Automatically learning the best activation function for the task is an active topic in neural network research. At the moment, despite promising results, it is still difficult to find a method for learning an activation function that is at the same time theoretically simple and easy to implement. Moreover, most of the methods proposed so far introduce new parameters or adopt different learning techniques. In this work we propose a simple method to obtain a trained activation function which adds to the neura…

Cited by 36 publications (13 citation statements)
References 30 publications

Citation statements (ordered by relevance):
“…Work towards the enhancement of activation functions in neural networks has also been proposed, such as the Variable Activation Function (VAF) [17] and Adaptive Takagi-Sugeno-Kang (AdaTSK) [23]. Apart from those adaptive activation functions, [15] proposed a two-layer mixture of factor analysers with joint factor loading (2L-MJFA) for conducting dimensionality reduction and classification together.…”
Section: Machine Learning for Digital Healthcare
confidence: 99%
“…Recently, pervasive healthcare has become a central topic attracting intensive attention and interest from academia, industry, and the healthcare sector [10,11,12,13,14,15,16,17]. In this problem domain, highly class-imbalanced data sets with large numbers of missing values are common problems [18].…”
Section: Introduction
confidence: 99%
“…Variable Activation Function. In (Apicella et al., 2019) trainable activation functions are expressed as sub-networks with only one hidden layer, relying on the observation that a one-hidden-layer neural network can approximate arbitrarily well any continuous mapping from one finite-dimensional space to another, enabling the resulting function to assume “any” shape. In a nutshell, the proposed activation function f is modelled by a neuron with an identity activation function that sends its output to a one-hidden-layer sub-network with a single output neuron, itself having an identity output function.…”
Section: Linear Combination of One-to-One Functions
confidence: 99%
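
To make the architecture described in this statement concrete, here is a minimal PyTorch sketch of a VAF-style trainable activation: each pre-activation passes through an identity input neuron into a one-hidden-layer sub-network with a single identity output neuron. The hidden width, the ReLU hidden nonlinearity, and the class name are illustrative assumptions, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    class VAF(nn.Module):
        # One-hidden-layer sub-network used as a trainable activation,
        # applied elementwise to each pre-activation.
        # Hidden width and ReLU hidden nonlinearity are assumed here.
        def __init__(self, hidden_units: int = 5):
            super().__init__()
            self.hidden = nn.Linear(1, hidden_units)  # sub-network hidden layer
            self.out = nn.Linear(hidden_units, 1)     # single output neuron (identity output)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            shape = x.shape
            z = x.reshape(-1, 1)  # identity input neuron: feed each scalar pre-activation
            z = self.out(torch.relu(self.hidden(z)))
            return z.reshape(shape)

    # Usage: drop VAF in wherever a fixed nonlinearity would go.
    net = nn.Sequential(nn.Linear(10, 32), VAF(), nn.Linear(32, 2))

Because the sub-network's weights are ordinary parameters, the shape of the activation is learned by the same backpropagation pass that trains the rest of the network.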
“…In other words, the key idea is to involve the activation functions in the learning process, together with (or separately from) the other parameters of the network such as weights and biases, thus obtaining a trained activation function. In the literature the expression “trainable activation functions” is most common, but the terms “learnable”, “adaptive”, or “adaptable” activation functions are also used; see, for example, (Scardapane et al., 2018; Apicella et al., 2019; Qian et al., 2018). Many heterogeneous trainable activation function models have been proposed in the literature, and in recent years there has been particular interest in this topic; see Figure 1.…”
Section: Introduction
confidence: 99%
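
As a complement to the sub-network approach above, the simplest way to realize the idea described in this statement is to expose a shape parameter of the activation as a trainable parameter, so the optimizer updates it alongside weights and biases. The sketch below is a minimal PReLU-style illustration of that idea, not this paper's method; the class name and initial slope are arbitrary choices.

    import torch
    import torch.nn as nn

    class LearnableSlopeReLU(nn.Module):
        # ReLU whose negative slope `a` is an nn.Parameter, so gradient
        # descent trains it together with the network's weights and biases.
        def __init__(self, init_slope: float = 0.25):
            super().__init__()
            self.a = nn.Parameter(torch.tensor(init_slope))

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # f(x) = x for x >= 0, a * x otherwise; gradients flow into `a`.
            return torch.where(x >= 0, x, self.a * x)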
“…Even though transfer learning (i.e., the use of CNNs pre-trained on large-scale datasets of natural images) could be applied, hundreds of accurately annotated input samples would still need to be available [28]. Therefore, parameter-efficient architectures, including simple trainable activation functions [29] or mixed-scale dense CNNs [30], might be beneficial for dealing with the paucity of manually labeled and validated datasets. Alternative approaches, such as data augmentation techniques based on Generative Adversarial Networks (GANs) [31,32] or interactive solutions [33], also require time-consuming annotation by experts.…”
Section: Introduction
confidence: 99%