“…Since in a one-hidden-layer ReLU network (without bias term), $F(0) = 0$, and furthermore $F$ is a $2kR$-Lipschitz continuous function, we conclude that $|F(x)| \le 2kR \cdot \|x\|_2$ for all $x \in \mathbb{R}^d$. Next, we can apply the proof of Lemma A.1 of [CKM21] to show that $G(x) := F(x)^2 - \mu$ is a zero-centered, sub-exponential random variable with sub-exponential norm $\|G\|_{\Psi_1} = O(\mu + 4R^2 k^3)$. Finally, by using the concentration property of sub-exponential random variables, we conclude that:…”
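
The first step of the quoted argument can be spelled out as a short derivation. This is only a sketch: it uses nothing beyond the two facts stated in the quote, namely $F(0) = 0$ and the $2kR$-Lipschitz property. The squared bound on the last line is an added consequence, not part of the quoted text.

```latex
% Sketch of the Lipschitz step, assuming only F(0) = 0 and that
% F is 2kR-Lipschitz with respect to the Euclidean norm.
\begin{align*}
|F(x)| &= |F(x) - F(0)| \\
       &\le 2kR \, \|x - 0\|_2 \\
       &= 2kR \, \|x\|_2
       \qquad \text{for all } x \in \mathbb{R}^d, \\
% Squaring both sides (added step, not in the quoted text):
F(x)^2 &\le 4k^2R^2 \, \|x\|_2^2 .
\end{align*}
```

The squared form is the one that matters for the next step, since $G(x)$ is defined through $F(x)^2$.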