2019
DOI: 10.48550/arxiv.1910.07476
Preprint

Hidden Unit Specialization in Layered Neural Networks: ReLU vs. Sigmoidal Activation

Elisa Oostwal, Michiel Straat, Michael Biehl

Abstract: We study layered neural networks of rectified linear units (ReLU) in a modelling framework for stochastic training processes. The comparison with sigmoidal activation functions is at the center of interest. We compute typical learning curves for shallow networks with K hidden units in matching student-teacher scenarios. The systems exhibit sudden changes in generalization performance via the process of hidden-unit specialization at critical sizes of the training set. Surprisingly, our results show that the…
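The abstract's setup lends itself to a compact simulation. Below is a minimal sketch of the matching student-teacher scenario it describes: a soft committee machine with K hidden units compared against a teacher of the same architecture, with the generalization error estimated by Monte Carlo over Gaussian inputs. The tanh stand-in for the sigmoidal activation, the unit output weights, and all parameter values are illustrative assumptions; the paper's actual analysis works in the thermodynamic limit with specific activation choices that this sketch does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(0)

def g_sigmoid(x):
    # tanh as a stand-in for the paper's sigmoidal activation (assumption)
    return np.tanh(x)

def g_relu(x):
    return np.maximum(0.0, x)

def committee(W, X, g):
    # Soft committee machine: unweighted sum of K hidden-unit activations
    return g(X @ W.T).sum(axis=1)

N, K = 100, 2                                 # input dimension, hidden units
B = rng.standard_normal((K, N)) / np.sqrt(N)  # teacher weight vectors
W = rng.standard_normal((K, N)) / np.sqrt(N)  # student weight vectors

X = rng.standard_normal((10_000, N))          # Gaussian input sample
for name, g in [("sigmoidal", g_sigmoid), ("ReLU", g_relu)]:
    # Monte Carlo estimate of the generalization error for this student/teacher pair
    eg = 0.5 * np.mean((committee(W, X, g) - committee(B, X, g)) ** 2)
    print(f"{name:>9}: generalization error ~ {eg:.4f}")
```

In the paper's framework, learning curves track how this error decays as the training-set size grows; the reported specialization transitions show up as sudden drops at critical set sizes.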


Cited by 1 publication (1 citation statement)
References 28 publications
“…2). Phase-change behavior of dynamical systems using the sigmoid and ReLU activation functions is known in the literature in the context of the generalization performance of deep neural networks [Çakmak and Opper, 2020; Oostwal et al., 2019]. In this section we present a complete proof of the bifurcation analysis of non-linear dynamical systems involving the sigmoid activation function, despite its connections with Çakmak and Opper [2020] and Oostwal et al. [2019].…”
mentioning
confidence: 99%
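The citing passage refers to phase-change (bifurcation) behavior in sigmoid dynamical systems. As a self-contained illustration, not taken from either cited work, the scalar iteration x_{t+1} = tanh(beta * x_t) exhibits the simplest such transition: the fixed point x* = 0 is stable for beta < 1 and loses stability at beta = 1, where two symmetric nonzero fixed points appear. The parameter name beta and the one-dimensional setting are assumptions for illustration only; the cited analyses concern higher-dimensional systems.

```python
import numpy as np

def fixed_point(beta, x0=1.0, iters=1000):
    # Iterate the scalar sigmoid map x_{t+1} = tanh(beta * x_t) to convergence
    x = x0
    for _ in range(iters):
        x = np.tanh(beta * x)
    return x

# Below beta = 1 the iterate collapses to 0; above it, a nonzero fixed point
# emerges -- a pitchfork bifurcation, the simplest analogue of the
# "phase change" behavior the citing passage describes.
for beta in (0.5, 0.9, 1.1, 2.0):
    print(f"beta={beta:>4}: x* ~ {fixed_point(beta):+.4f}")
```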