Universal activation function for machine learning
2021 · DOI: 10.1038/s41598-021-96723-8

Abstract: This article proposes a universal activation function (UAF) that achieves near-optimal performance in quantification, classification, and reinforcement learning (RL) problems. For any given problem, gradient descent algorithms are able to evolve the UAF into a suitable activation function by tuning the UAF's parameters. For CIFAR-10 classification using the VGG-8 neural network, the UAF converges to a Mish-like activation function, which has near-optimal performance $$F_{1}=0.902\pm 0.004$$ …
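A minimal sketch of how such a trainable parametric activation could be implemented is shown below. The five-parameter functional form and the names A–E are assumptions for illustration; the abstract above does not state the UAF's exact formula.

```python
# Sketch of a trainable parametric activation in the spirit of the UAF.
# The functional form and parameter names (A..E) are illustrative
# assumptions, not quoted from the paper's abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UAFSketch(nn.Module):
    """Activation whose shape parameters are learned by gradient descent
    alongside the network weights."""
    def __init__(self):
        super().__init__()
        # One scalar parameter each; the optimizer tunes these so the
        # activation can morph toward, e.g., a Mish-like shape.
        self.A = nn.Parameter(torch.tensor(1.0))
        self.B = nn.Parameter(torch.tensor(0.0))
        self.C = nn.Parameter(torch.tensor(0.0))
        self.D = nn.Parameter(torch.tensor(0.0))
        self.E = nn.Parameter(torch.tensor(0.0))

    def forward(self, x):
        # softplus(z) = ln(1 + e^z), computed stably by PyTorch.
        pos = F.softplus(self.A * (x + self.B)) + self.C * x.pow(2)
        neg = F.softplus(self.D * (x - self.B))
        return pos - neg + self.E
```

Swapping this module in for a fixed nn.ReLU in a VGG-style network lets the optimizer tune the activation's shape per layer together with the weights, which matches the evolution-by-gradient-descent idea described in the abstract.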

Cited by 42 publications (22 citation statements) · References 22 publications

Citation statements:

“…We used sigmoid as an activation function. The main reason why we use the sigmoid function is that the proposed model predicts probability as an output [ 23 ].…”
Section: Methods
confidence: 99%
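The cited rationale is that the sigmoid maps any real-valued logit into (0, 1), so the output can be read directly as a probability. A quick illustrative check (not code from the cited paper):

```python
# The sigmoid squashes any real-valued logit into (0, 1),
# so the output can be interpreted as a probability.
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(-2.0), sigmoid(0.0), sigmoid(2.0))  # ~0.119, 0.5, ~0.881
```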
“…The precision value is above 0.8 for all activation functions except ReLU (0.01); the highest precision, 0.902, is achieved by the softplus and UAF activation functions. The recall value is above 0.8 for all activation functions except ReLU (0.1); the highest recall, 0.902, is likewise achieved by the softplus and UAF activation functions. The F1 value is above 0.8 for all activation functions except ReLU (0.018) [9].…”
Section: Introduction
confidence: 91%
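The quoted numbers are internally consistent: with precision and recall both at 0.902, the F1 score (their harmonic mean) is also 0.902. An illustrative check:

```python
# Illustrative only: F1 is the harmonic mean of precision and recall,
# so precision = recall = 0.902 implies F1 = 0.902.
def f1_score(precision: float, recall: float) -> float:
    return 2 * precision * recall / (precision + recall)

print(f1_score(0.902, 0.902))  # -> 0.902
```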
“…For example, two-section distributed-feedback (DFB) lasers 20 , vertical-cavity surface-emitting laser (VCSEL) 21 and disk lasers 22 have shown promising results, but they are bottlenecked by network scale, frequency of access and power consumption. Moreover, their nonlinear activation responses tend to be fixed during accelerator fabrication, but the nonlinear activation forms should be reprogrammed according to different ANN models and data sets 23 . Thus, as a complementary approach, a more straightforward and flexible implementation is attained by calculating the nonlinear functions in CPU, which connects physical photonic neural networks through electrical-to-optical (E/O) and optical-to-electrical (O/E) converters 24,25 .…”
Section: Introduction
confidence: 99%