2021
DOI: 10.1007/s00365-021-09545-2
Best k-Layer Neural Network Approximations

Abstract: We show that the empirical risk minimization (ERM) problem for neural networks has no solution in general. Given a training set s_1, …, s_n ∈ ℝ^p with corresponding responses t_1, …, t_n ∈ ℝ^q, fitting a k-layer neural network ν_θ : ℝ^p → ℝ^q involves estimation of the weights θ ∈ ℝ^m via an ERM: inf_{θ ∈ ℝ^m} Σ_{i=1}^n ‖t_i − ν_θ(s_i)‖₂². We show that even for k = 2, this infimum is not attainable in general for common activations like ReLU, hyperbolic tangent, and sigmoid functions. In addition, we deduce that if one attempts to minimize s…
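For reference, the abstract's k = 2 case corresponds to a standard two-layer parametrization (a sketch using common conventions; details such as bias placement and how layers are counted are my assumption, not taken from the paper):

  ν_θ(s) = W₂ σ(W₁ s + b₁) + b₂,   θ = (W₁, b₁, W₂, b₂) ∈ ℝ^m,

where σ is the activation (ReLU, tanh, or sigmoid) applied entrywise and m counts all entries of the weight matrices and bias vectors.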

Cited by 3 publications (3 citation statements) · References 16 publications
“…[12, Lemma 27.3]) such that there exists a network configuration achieving zero error and, thus, a global minimum in the search space. For shallow feedforward ANNs using ReLU activation it has been shown that also in the underparametrized regime there exists a global minimum if the ANN has a one-dimensional output [18], whereas there are pathological counterexamples in higher dimensions [19]. However, for general measures µ not necessarily consisting of a finite number of Dirac measures, the literature on the existence of global minima is very limited.…”
Section: Introduction
confidence: 99%
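The zero-error configuration mentioned in the statement above is easy to exhibit in the simplest setting. The following is a minimal sketch (my own 1-D illustration with one-dimensional output and made-up data, not the construction of [12, Lemma 27.3] or [18]): a shallow ReLU network with one hidden unit per breakpoint interpolates n distinct scalar inputs exactly, so the empirical risk attains its global minimum of zero.

import numpy as np

# Made-up 1-D training data (illustrative only, not from any cited paper).
x = np.array([0.0, 1.0, 2.0, 3.5, 5.0])   # inputs s_1, ..., s_n
y = np.array([1.0, -1.0, 0.5, 2.0, 0.0])  # responses t_1, ..., t_n

# Piecewise-linear interpolant written as a shallow ReLU network:
# nu(t) = y_1 + sum_j c_j * ReLU(t - x_j), one hidden unit per breakpoint.
slopes = np.diff(y) / np.diff(x)                    # slope of each segment
c = np.concatenate(([slopes[0]], np.diff(slopes)))  # outer-layer weights

def relu_net(t):
    t = np.atleast_1d(t)
    hidden = np.maximum(t[:, None] - x[:-1][None, :], 0.0)  # hidden-layer activations
    return y[0] + hidden @ c

print(relu_net(x) - y)                 # ~0: the network interpolates the data
print(np.sum((relu_net(x) - y) ** 2))  # empirical risk ~0, so the infimum is attained

Once the number of hidden units is too small for such interpolation (the underparametrized regime), attainability of the infimum is no longer automatic, which is the contrast drawn in the statement above.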
“…This phenomenon can also be observed in empirical risk minimization for the hyperbolic tangent activation. As shown in [19], in the underparametrized setting there exist input data such that, for all output data from a set of positive Lebesgue measure, no minimizer exists in the optimization landscape.…”
Section: Introduction
confidence: 99%
“…This phenomenon can also be observed in empirical risk minimization for the hyperbolic tangent activation. As shown in [LMQ22], in the underparametrized setting there exist input data such that, for all output data from a set of positive Lebesgue measure, no minimizer exists in the optimization landscape. It remains an open problem whether this phenomenon prevails for ReLU activation.…”
Section: Introduction
confidence: 99%
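A standard mechanism behind such non-attainability (a generic sketch, not necessarily the specific construction used in [19]/[LMQ22]) is that the closure of the model class contains functions that no finite parameter value realizes. For the hyperbolic tangent, for instance,

  a · ( tanh(x + 1/a) − tanh(x) ) → tanh′(x)   as a → ∞,

so a two-neuron tanh network approximates the bump-shaped function tanh′ arbitrarily well, but, since tanh′ is not itself of that form, only along parameter sequences that leave every bounded set. This is the prototypical way the situation described above can arise: responses lying on (or near) such a limit function admit minimizing sequences with diverging weights, while no actual minimizer need exist.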