We study the problem of agnostically learning halfspaces, which is defined by a fixed but unknown distribution D on Q^n × {±1}. We define Err_HALF(D) as the least error of a halfspace classifier for D. A learner with access to D must return a hypothesis whose error is small compared to Err_HALF(D). Using the recently developed method of [33], we prove hardness-of-learning results assuming that random K-XOR formulas are hard to (strongly) refute. We show that no efficient learning algorithm has non-trivial worst-case performance even under the guarantees that Err_HALF(D) ≤ η for an arbitrarily small constant η > 0, and that D is supported on {±1}^n × {±1}. Namely, even under these favorable conditions, and for every c > 0, it is hard to return a hypothesis with error ≤ 1/2 − 1/n^c. In particular, no efficient algorithm can achieve a constant approximation ratio. Under a stronger version of the assumption (where K can be poly-logarithmic in n), we can take η = 2^{−log^{1−ν}(n)} for an arbitrarily small ν > 0. These results substantially improve on previously known results [31,38,50,51,44], which only show hardness of exact learning.

A K-tuple is a mapping C : {±1}^n → {±1}^K in which each output coordinate is a literal, and the K literals correspond to K different variables. The collection of K-tuples is denoted X_{n,K}. A K-formula is a collection J = {C_1, ..., C_m} of K-tuples. An instance of the K-XOR problem is a K-formula, and the goal is to find an assignment ψ ∈ {±1}^n that maximizes VAL_{ψ,XOR}(J) := |{j : XOR_K(C_j(ψ)) = 1}| / m. We define the value of J as VAL_XOR(J) := max_{ψ ∈ {±1}^n} VAL_{ψ,XOR}(J). We allow K to vary with n (but it is still fixed for every n); for example, we can consider the log(n)-XOR problem.

We consider the problem of distinguishing random formulas from formulas with high value. Concretely, for m = m(n), K = K(n) and 1/2 > η = η(n) > 0, we say that the problem CSP^{rand,1−η}_m(XOR_K) is easy if there exists an efficient randomized algorithm A with the following properties. Its input is a K-formula J with n variables and m constraints, and its output satisfies:
• If VAL_XOR(J) ≥ 1 − η, then Pr_{coins of A}(A(J) = "non-random") ≥ 3/4.
• If J is random, then with probability 1 − o_n(1) over the choice of J, Pr_{coins of A}(A(J) = "random") ≥ 3/4.

It is not hard to see that the problem gets easier as m gets larger and as η gets smaller. For η = 0, the problem is actually easy, as it can be solved using Gaussian elimination. However, even for slightly larger η the problem seems to become hard. For example, for any constant η > 0, the best known algorithms [37,25,26,11,3] only work with m = Ω(n^{K/2}) constraints.
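To make the K-XOR notation above concrete, here is a minimal Python sketch (not from the paper) that represents a K-formula as a list of signed-literal tuples, evaluates VAL_{ψ,XOR}(J) for a given assignment, computes VAL_XOR(J) by brute force for tiny n, and samples a random formula. The helper names and the convention that XOR_K over {±1} is the product of the literal values are illustrative assumptions, not taken from the source.

```python
import math
import random
from itertools import product

def eval_tuple(c, psi):
    """Apply a K-tuple c (a tuple of signed 1-based variable indices: +i is the
    literal x_i, -i is its negation) to an assignment psi: {1..n} -> {+1, -1}."""
    return [psi[abs(lit)] * (1 if lit > 0 else -1) for lit in c]

def val_xor_assignment(J, psi):
    """VAL_{psi,XOR}(J): the fraction of K-tuples in J whose literals XOR to 1.
    Over {+1, -1} we take XOR_K to be the product of the K literal values
    (one common convention)."""
    return sum(1 for c in J if math.prod(eval_tuple(c, psi)) == 1) / len(J)

def val_xor(J, n):
    """VAL_XOR(J) = max over all 2^n assignments (brute force; only for tiny n)."""
    best = 0.0
    for bits in product([1, -1], repeat=n):
        psi = {i + 1: b for i, b in enumerate(bits)}
        best = max(best, val_xor_assignment(J, psi))
    return best

def random_formula(n, m, K, rng=random):
    """A random K-formula: each constraint picks K distinct variables and
    independent random signs (the 'random' case of CSP^{rand,1-eta}_m(XOR_K))."""
    J = []
    for _ in range(m):
        vars_ = rng.sample(range(1, n + 1), K)
        J.append(tuple(v * rng.choice([1, -1]) for v in vars_))
    return J

if __name__ == "__main__":
    rng = random.Random(0)
    J = random_formula(n=10, m=60, K=3, rng=rng)
    print("VAL_XOR(J) for a random 3-XOR formula:", val_xor(J, n=10))
```

For a random formula with m sufficiently large relative to n, VAL_XOR(J) concentrates near 1/2 with high probability, which is why distinguishing random formulas from formulas of value at least 1 − η is the natural (strong) refutation task.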
The basic problem in the PAC model of computational learning theory is to determine which hypothesis classes are efficiently learnable. There is presently a dearth of results showing hardness of learning problems. Moreover, the existing lower bounds fall short of the best known algorithms.

The biggest challenge in proving complexity results is to establish hardness of improper learning (a.k.a. representation-independent learning). The difficulty in proving lower bounds for improper learning is that the standard reductions from NP-hard problems do not seem to apply in this context. There is essentially only one known approach to proving lower bounds on improper learning; it was initiated in [21] and relies on cryptographic assumptions.

We introduce a new technique for proving hardness of improper learning, based on reductions from problems that are hard on average. We put forward a (fairly strong) generalization of Feige's assumption [13] about the complexity of refuting random constraint satisfaction problems. Combining this assumption with our new technique yields far-reaching implications. In particular:
• Learning DNF's is hard.
• Agnostically learning halfspaces with a constant approximation ratio is hard.
• Learning an intersection of ω(1) halfspaces is hard.
A recent line of work, starting with Beigman and Vohra [3] and Zadimoghaddam and Roth [28], has addressed the problem of learning a utility function from revealed preference data. The goal is to use past data describing the purchases of a utility-maximizing agent, when faced with certain prices and budget constraints, to produce a hypothesis that can accurately forecast the agent's future behavior.

In this work we advance this line of research by providing sample complexity guarantees and efficient algorithms for a number of important classes. By drawing a connection to recent advances in multi-class learning, we provide a computationally efficient algorithm with tight sample complexity guarantees (Θ(d/ε) for the case of d goods) for learning linear utility functions under a linear price model. This solves an open question of Zadimoghaddam and Roth [28]. Our technique yields numerous generalizations, including the ability to learn other well-studied classes of utility functions, to deal with a misspecified model, and to handle non-linear prices.
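To illustrate the revealed-preference setup in its simplest instance, here is a small Python sketch under the assumption (made here for illustration, not stated in the abstract) that the agent has a linear utility w·x over d divisible goods and a budget, and therefore spends the entire budget on the good with the highest value-per-price ratio. Each observation then reveals which of the d "classes" wins, and a multiclass-perceptron-style learner recovers w up to scale. The function names (`simulate_agent`, `learn_utility`) are hypothetical, and this is not the paper's algorithm, only a sketch of the multi-class-learning connection the abstract mentions.

```python
import numpy as np

def simulate_agent(w, prices):
    """Assumed agent behavior: with linear utility w.x, divisible goods and a
    budget, the optimal bundle spends everything on argmax_j w_j / p_j."""
    return int(np.argmax(w / prices))

def learn_utility(observations, d, epochs=50):
    """Multiclass-perceptron-style learner: each observation (p, j*) says that
    w_{j*}/p_{j*} >= w_j/p_j for every good j, i.e. w linearly separates class
    j* under the feature map phi_j(p) = e_j / p_j. Returns an estimate of w
    (up to scale)."""
    w = np.ones(d)
    for _ in range(epochs):
        mistakes = 0
        for p, j_star in observations:
            scores = w / p                      # w . phi_j(p) for every good j
            j_hat = int(np.argmax(scores))
            if j_hat != j_star:                 # standard multiclass update
                w[j_star] += 1.0 / p[j_star]
                w[j_hat] -= 1.0 / p[j_hat]
                w = np.maximum(w, 1e-9)         # keep utilities nonnegative
                mistakes += 1
        if mistakes == 0:
            break
    return w / w.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, T = 5, 200
    w_true = rng.random(d) + 0.1
    data = []
    for _ in range(T):
        p = rng.random(d) + 0.1                 # strictly positive prices
        data.append((p, simulate_agent(w_true, p)))
    w_hat = learn_utility(data, d)
    agree = np.mean([simulate_agent(w_hat, p) == j for p, j in data])
    print("agreement with observed choices:", agree)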
Let f : S^{d−1} × S^{d−1} → R be a function of the form f(x, x′) = g(⟨x, x′⟩) for some g : [−1, 1] → R. We give a simple proof showing that poly-size depth-two neural networks with (exponentially) bounded weights cannot approximate f whenever g cannot be approximated by a low-degree polynomial. Moreover, for many g's, such as g(x) = sin(πd^3 x), the number of neurons must be 2^{Ω(d log(d))}. Furthermore, the result holds w.r.t. the uniform distribution on S^{d−1} × S^{d−1}. As many functions of the above form can be well approximated by poly-size depth-three networks with poly-bounded weights, this establishes a separation between depth-two and depth-three networks w.r.t. the uniform distribution on S^{d−1} × S^{d−1}.
Algorithmic mechanism design (AMD) studies the delicate interplay between computational efficiency, truthfulness, and optimality. We focus on AMD's paradigmatic problem: combinatorial auctions. We present a new generalization of the VC dimension to multivalued collections of functions, which encompasses the classical VC dimension, Natarajan dimension, and Steele dimension. We present a corresponding generalization of the Sauer-Shelah Lemma and harness this VC machinery to establish inapproximability results for deterministic truthful mechanisms. Our results essentially unify all inapproximability results for deterministic truthful mechanisms for combinatorial auctions to date and establish new separation gaps between truthful and non-truthful algorithms. * Readers are encouraged to consult the full version of this paper, which can be found on the authors' webpages and on arXiv.