Cartoon Approximation with α-Curvelets

Keiper, Sandra; Kutyniok, Gitta; Schäfer, Martin

doi:10.1007/s00041-015-9446-6

Cited by 21 publications

(63 citation statements)

References 13 publications

“…If $C$ is a bounded subset of $B^s_{p,q}(\mathbb{R}^d)$, then we have $\gamma^*(C) = s/d$. In the present paper, we shall be particularly interested in so-called β-cartoon-like functions, for which the optimal exponent is given by β/2 (see [18,26] and Theorem 6.3).…”

Section: Min-max Rate Distortion Theory (mentioning)

confidence: 99%
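
To make the two exponents quoted in this statement concrete, here is a minimal Python sketch. It assumes an idealized error law ε ≈ c · M^(-γ) with constant c = 1 and no logarithmic factors; both are simplifying assumptions for illustration, not claims taken from the cited papers. The names `terms_needed`, `besov_exponent`, and `cartoon_exponent` are hypothetical.

```python
import math


def besov_exponent(s: float, d: int) -> float:
    """Optimal exponent s/d quoted for bounded subsets of a Besov space."""
    return s / d


def cartoon_exponent(beta: float) -> float:
    """Optimal exponent beta/2 quoted for beta-cartoon-like functions."""
    return beta / 2.0


def terms_needed(eps: float, gamma: float, c: float = 1.0) -> int:
    """Smallest integer M with c * M**(-gamma) <= eps.

    Assumption: the error follows the idealized law c * M**(-gamma);
    real bounds carry unknown constants and possibly log factors.
    """
    if eps <= 0 or gamma <= 0 or c <= 0:
        raise ValueError("eps, gamma and c must be positive")
    m = max(1, math.ceil((c / eps) ** (1.0 / gamma)))
    # Guard against floating-point rounding near the boundary.
    while c * m ** (-gamma) > eps:
        m += 1
    return m


if __name__ == "__main__":
    eps = 0.03
    for beta in (1.0, 1.5, 2.0):
        gamma = cartoon_exponent(beta)
        print(f"beta={beta}: gamma={gamma}, M={terms_needed(eps, gamma)}")
```

Under this idealization, a larger exponent γ means a smaller budget M for the same target accuracy, which is why the exponent serves as the figure of merit for a function class.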

Optimal Approximation with Sparsely Connected Deep Neural Networks

Bölcskei, Grohs, Kutyniok, et al. 2019

SIAM Journal on Mathematics of Data Science

Self Cite


We derive fundamental lower bounds on the connectivity and the memory requirements of deep neural networks guaranteeing uniform approximation rates for arbitrary function classes in $L^2(\mathbb{R}^d)$. In other words, we establish a connection between the complexity of a function class and the complexity of deep neural networks approximating functions from this class to within a prescribed accuracy. Additionally, we prove that our lower bounds are achievable for a broad family of function classes. Specifically, all function classes that are optimally approximated by a general class of representation systems, so-called affine systems, can be approximated by deep neural networks with minimal connectivity and memory requirements. Affine systems encompass a wealth of representation systems from applied harmonic analysis such as wavelets, ridgelets, curvelets, shearlets, α-shearlets, and more generally α-molecules. Our central result elucidates a remarkable universality property of neural networks and shows that they achieve the optimum approximation properties of all affine systems combined. As a specific example, we consider the class of $\alpha^{-1}$-cartoon-like functions, which is approximated optimally by α-shearlets. We also explain how our results can be extended to the case of functions on low-dimensional immersed manifolds. Finally, we present numerical experiments demonstrating that the standard stochastic gradient descent algorithm generates deep neural networks providing close-to-optimal approximation rates. Moreover, these results indicate that stochastic gradient descent can actually learn approximations that are sparse in the representation systems optimally sparsifying the function class the network is trained on.

Throughout the paper, we consider the case $\Phi : \mathbb{R}^d \to \mathbb{R}$, i.e., $N_L = 1$, which includes situations such as the classification and temperature prediction problem described above. We emphasize, however, that the general results of Sections 3, 4, and 5 are readily generalized to $N_L > 1$.

We denote the class of networks $\Phi : \mathbb{R}^d \to \mathbb{R}$ with exactly $L$ layers, connectivity no more than $M$, and activation function $\rho$ by $\mathcal{NN}_{L,M,d,\rho}$ with the understanding that for $L = 1$, the set $\mathcal{NN}_{L,M,d,\rho}$ is empty. Moreover, we let $\mathcal{NN}_{\infty,M,d,\rho} := \bigcup_{L \in \mathbb{N}} \mathcal{NN}_{L,M,d,\rho}$, $\mathcal{NN}_{L,\infty,d,\rho} := \bigcup_{M \in \mathbb{N}} \mathcal{NN}_{L,M,d,\rho}$, $\mathcal{NN}_{\infty,\infty,d,\rho} := \bigcup_{L \in \mathbb{N}} \mathcal{NN}_{L,\infty,d,\rho}$.

Now, given a function $f : \mathbb{R}^d \to \mathbb{R}$, we are interested in the theoretically best possible approximation of $f$ by a network $\Phi \in \mathcal{NN}_{\infty,M,d,\rho}$. Specifically, we will want to know how the approximation quality depends on the connectivity $M$ and what the associated number of bits needed to store the network topology […]

[…] $\sum_{i=1}^{7} c_i f(\cdot - d_i)$ is compactly supported, has 7 vanishing moments in $x_1$-direction, and $\hat{g}(\xi) = 0$ for all $\xi \in [-3, 3]^2$ such that $\xi_1 = 0$. Then, by Theorem 6.4 and Remark 6.7 there exists $\delta > 0$ such that $\mathcal{SH}_\alpha(f, g, \delta; \Omega)$ is optimal for $\mathcal{E}^{1/\alpha}(\Omega; \nu)$. We define […] where we order $(A_j)_{j \in \mathbb{N}}$ such that $|\det(A_j)| \le |\det(A_{j+1})|$, for all $j \in \mathbb{N}$. This construction implies that the α-shearlet system $\mathcal{SH}_\alpha(f, g, \delta; \Omega)$ is an affi...


“…The optimal exponent $\gamma^*(\mathcal{E}^\beta(\mathbb{R}^2; \nu))$ was found in [18,26]: Theorem 6.3. For $\beta \in [1,2]$, and $\nu > 0$, we have…”

Section: α-Shearlets and Cartoon-like Functions (mentioning)

confidence: 99%


“…$\omega \in \mathbb{R}^d$, is monotonically increasing in $|\omega|$, for $s > 0$, - the space $\mathcal{C}^K_{\mathrm{CART}}$ of cartoon functions of size $K$, introduced in [35], and widely used in the mathematical signal processing literature [15], [19], [26], [36], [37] as a model for natural images such as, e.g., images of handwritten digits [38] (see Figure 4). For a formal definition of $\mathcal{C}^K_{\mathrm{CART}}$, we refer the reader to Appendix B, where we also show that $\mathcal{C}^K_{\mathrm{CART}} \subseteq H^s(\mathbb{R}^d)$, for $K > 0$ and $s \in (0, 1/2)$.…”

Section: Energy Decay and Trivial Null-set (mentioning)

confidence: 99%

“…We note that $d$-dimensional uniform covering filters as introduced in [11] are functions whose Fourier transforms' support sets can be covered by a union of finitely many balls. This covering condition is satisfied by, e.g., Weyl-Heisenberg filters [21] with a bandlimited prototype function, but fails to hold for multi-scale filters such as wavelets [22], [23], (α)-curvelets [24]-[26], shearlets [27], [28], or ridgelets [29]-[31], see [11, Remark 2.2 (b)].…”

Section: Introduction (mentioning)

confidence: 99%

“…The first canonical orthant is $H := \{x \in \mathbb{R}^d \mid x_k \ge 0,\ k = 1, \dots, d\}$, and we define the rotated orthant… (Footnote 1: A wide range of practically relevant signal classes are Sobolev functions, for example, band-limited functions and, as established in the present paper, cartoon functions [35]. We note that cartoon functions are widely used in the mathematical signal processing literature [15], [19], [26], [36], [37] as a model for natural images such as, e.g., images of handwritten digits [38].)”

Section: Introduction (mentioning)

confidence: 99%

Energy Propagation in Deep Convolutional Neural Networks

Wiatowski, Bölcskei 2018

IEEE Trans. Inform. Theory

Self Cite

Many practical machine learning tasks employ very deep convolutional neural networks. Such large depths pose formidable computational challenges in training and operating the network. It is therefore important to understand how fast the energy contained in the propagated signals (a.k.a. feature maps) decays across layers. In addition, it is desirable that the feature extractor generated by the network be informative in the sense of the only signal mapping to the all-zeros feature vector being the zero input signal. This "trivial null-set" property can be accomplished by asking for "energy conservation" in the sense of the energy in the feature vector being proportional to that of the corresponding input signal. This paper establishes conditions for energy conservation (and thus for a trivial null-set) for a wide class of deep convolutional neural network-based feature extractors and characterizes corresponding feature map energy decay rates. Specifically, we consider general scattering networks employing the modulus non-linearity and we find that under mild analyticity and high-pass conditions on the filters (which encompass, inter alia, various constructions of Weyl-Heisenberg filters, wavelets, ridgelets, (α)-curvelets, and shearlets) the feature map energy decays at least polynomially fast. For broad families of wavelets and Weyl-Heisenberg filters, the guaranteed decay rate is shown to be exponential. Moreover, we provide handy estimates of the number of layers needed to have at least ((1 − ε) · 100)% of the input signal energy be contained in the feature vector.

Index Terms: Machine learning, deep convolutional neural networks, scattering networks, energy decay and conservation, frame theory.
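
The layer-count estimate mentioned at the end of the abstract above can be illustrated with a minimal Python sketch. It assumes a simplified model in which the fraction of input energy not yet captured after N layers is bounded by a^(-N) for some decay factor a > 1; this model, the function name `layers_needed`, and the example decay factors are hypothetical and do not reproduce the paper's actual bounds.

```python
import math


def layers_needed(eps: float, decay_factor: float) -> int:
    """Smallest N with decay_factor**(-N) <= eps.

    Assumption: the uncaptured energy fraction after N layers is at most
    decay_factor**(-N). The real estimates depend on the signal class
    and the filters, which this toy model ignores.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    if decay_factor <= 1.0:
        raise ValueError("decay_factor must exceed 1")
    n = math.ceil(math.log(1.0 / eps) / math.log(decay_factor))
    # Guard against floating-point rounding near the boundary.
    while decay_factor ** (-n) > eps:
        n += 1
    return n


if __name__ == "__main__":
    # Layers needed to capture at least 95% of the energy (eps = 0.05)
    # for two hypothetical decay factors.
    for a in (1.5, 2.0):
        print(f"a={a}: N={layers_needed(0.05, a)}")
```

Because N grows only logarithmically in 1/ε under this model, exponential decay implies that a modest number of layers suffices even for tight energy targets.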

Shearlets: From Theory to Deep Learning

Kutyniok

2021

Handbook of Mathematical Models and Algorithms in Computer Vision and Imaging
