2019
DOI: 10.48550/arxiv.1912.07464
Preprint

Realization of spatial sparseness by deep ReLU nets with massive data

Abstract: The great success of deep learning poses urgent challenges for understanding its working mechanism and rationale. Depth, structure, and the massive size of data are recognized as three key ingredients of deep learning. Most recent theoretical studies of deep learning focus on the necessity and advantages of the depth and structure of neural networks. In this paper, we aim at a rigorous verification of the importance of massive data in embodying the outperformance of deep learning. To approximate …

Cited by 1 publication (4 citation statements: 0 supporting, 4 mentioning, 0 contrasting; all published in 2020)
References 47 publications (103 reference statements)
“…In particular, Part (b) of the following theorem gives a solution to the inverse problem of determining which smoothness class the target function belongs to near each point of X. In theory, this leads to a data-based detection of singularities and sparsity analogous to what is assumed in [6], but in a much more general setting. Theorem 4.3.…”
Section: Results (mentioning)
confidence: 99%
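To make the idea of data-based detection of local smoothness concrete, here is a minimal sketch of one generic, textbook-style estimator (not the construction of Theorem 4.3 in the citing paper): it estimates a local Hölder exponent at a point by fitting the log-log decay of local oscillations across shrinking scales. The function name and all parameters are illustrative assumptions.

```python
import numpy as np

def local_holder_exponent(f, x0, scales):
    """Estimate the local Holder exponent of f at x0 from the decay of
    local oscillations: osc(h) = max_{|x - x0| <= h} |f(x) - f(x0)| ~ h^alpha.
    A least-squares fit of log osc(h) against log h recovers alpha."""
    oscs = []
    for h in scales:
        xs = np.linspace(x0 - h, x0 + h, 201)  # dense local sampling grid
        oscs.append(np.max(np.abs(f(xs) - f(x0))))
    slope, _ = np.polyfit(np.log(scales), np.log(oscs), 1)
    return slope

# Example: f(x) = |x|^0.5 has Holder exponent 0.5 at the singularity x0 = 0.
f = lambda x: np.abs(x) ** 0.5
print(local_holder_exponent(f, 0.0, np.logspace(-4, -1, 10)))  # ~0.5
```

Points where the estimated exponent drops are flagged as singularities; the quoted theorem performs this kind of local classification in a much more general setting.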
“…Our first theorem describes local function recovery using local sampling. We may interpret it in the spirit of distributed learning as in [6,25], where we take a linear combination of pre-fabricated networks G_n using the function values themselves as the coefficients. The networks G_n have essentially the same localization property as the kernels Φ_n (cf.…”
Section: Results (mentioning)
confidence: 99%
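The recovery scheme quoted above, a linear combination of localized units with the sampled function values as coefficients, can be mimicked with a simple partition-of-unity sketch. The Gaussian bump below merely stands in for the localized networks G_n and kernels Φ_n of the cited construction; the function name, the bandwidth parameter, and the test function are illustrative assumptions, not the paper's method.

```python
import numpy as np

def localized_recovery(x_query, nodes, values, scale):
    """Recover f at x_query as a linear combination of localized bumps
    centered at the sample nodes, using the sampled function values as
    coefficients, with the bumps normalized to a partition of unity."""
    # Gaussian bump as a stand-in for a localized kernel / network.
    w = np.exp(-((x_query[:, None] - nodes[None, :]) / scale) ** 2)
    w /= w.sum(axis=1, keepdims=True)  # normalize: weights sum to 1
    return w @ values

nodes = np.linspace(0.0, 1.0, 64)        # local samples of the target f
values = np.sin(2 * np.pi * nodes)       # f(y_j), used directly as coefficients
xq = np.linspace(0.0, 1.0, 7)
print(localized_recovery(xq, nodes, values, scale=0.05))  # ~ sin(2*pi*xq)
```

Normalizing the bumps to sum to one makes constants reproduced exactly; per the quoted statement, the cited construction instead obtains the analogous localization property from the networks G_n themselves.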