We give new rounding schemes for the standard linear programming relaxation of the correlation clustering problem, achieving approximation factors that almost match the integrality gaps:
• For complete graphs our approximation is 2.06 − ε, which almost matches the previously known integrality gap of 2.
• For complete k-partite graphs our approximation is 3. We also show a matching integrality gap.
• For complete graphs with edge weights satisfying triangle inequalities and probability constraints, our approximation is 1.5, and we show an integrality gap of 1.2.
Our results improve a long line of work on approximation algorithms for correlation clustering in complete graphs, previously culminating in a ratio of 2.5 for the complete case by Ailon, Charikar and Newman (JACM'08). In the weighted complete case satisfying triangle inequalities and probability constraints, the same authors give a 2-approximation; for the bipartite case, Ailon, Avigdor-Elgrabli, Liberty and van Zuylen give a 4-approximation (SICOMP'12).
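For reference, a minimal sketch of the standard LP relaxation that such rounding schemes start from (this formulation is the well-known one; the variable names are ours): a variable x_{uv} ∈ [0, 1] represents the "distance" between u and v, with x_{uv} = 1 meaning the pair is separated.

\[
\begin{aligned}
\text{minimize}\quad & \sum_{(u,v)\in E^{+}} x_{uv} \;+\; \sum_{(u,v)\in E^{-}} \bigl(1 - x_{uv}\bigr)\\
\text{subject to}\quad & x_{uw} \le x_{uv} + x_{vw} && \text{for all } u, v, w,\\
& x_{uv} \in [0,1] && \text{for all } u, v,
\end{aligned}
\]

where E^{+} and E^{-} are the "similar" and "dissimilar" pairs. A rounding scheme converts a fractional solution into an actual clustering while bounding the increase in the objective, which is where the approximation factors above come from.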
Random constraint satisfaction problems (CSPs) are known to exhibit threshold phenomena: given a uniformly random instance of a CSP with n variables and m clauses, there is a value of m = Ω(n) beyond which the CSP will be unsatisfiable with high probability. Strong refutation is the problem of certifying that no variable assignment satisfies more than a constant fraction of clauses; this is the natural algorithmic problem in the unsatisfiable regime (when m/n = ω(1)).

Intuitively, strong refutation should become easier as the clause density m/n grows, because the contradictions introduced by the random clauses become more locally apparent. For CSPs such as k-SAT and k-XOR, there is a long-standing gap between the clause density at which efficient strong refutation algorithms are known, m/n ≥ O(n^{k/2−1}), and the clause density at which instances become unsatisfiable with high probability, m/n = ω(1).

In this paper, we give spectral and sum-of-squares algorithms for strongly refuting random k-XOR instances with clause density m/n ≥ O(n^{(k/2−1)(1−δ)}) in time exp(O(n^δ)) or in O(n^δ) rounds of the sum-of-squares hierarchy, for any δ ∈ [0, 1) and any integer k ≥ 3. Our algorithms provide a smooth transition between the clause density at which polynomial-time algorithms are known at δ = 0, and brute-force refutation at the satisfiability threshold when δ = 1. We also leverage our k-XOR results to obtain strong refutation algorithms for SAT (or any other Boolean CSP) at similar clause densities. Our algorithms match the known sum-of-squares lower bounds due to Grigoriev and Schoenebeck, up to logarithmic factors.

First we will bound the value of the term in (3.9). Recall that by Lemma 3.7, if E is even then the number of distinct labels in V ∈ 𝒱 is less than E_0. Therefore,

Now, we will use the following claim:

Proof. If H_σ is diagonal-free and even, then we claim that each pivot value appears twice. Suppose not, i.e., there is some σ_i such that σ_i ≠ σ_j for all j ≠ i. Since H_σ is diagonal-free, the two hyperedges involving σ_i are distinct. Since this is the unique occurrence of these two hyperedges in H_σ, H_σ cannot be even, a contradiction. With each pivot appearing at least twice, the number of distinct choices of σ is at most (2d)! n^d.
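As a concrete, heavily simplified illustration of the spectral approach at the polynomial-time end of this trade-off (the case δ = 0 with k = 4), the sketch below certifies an upper bound on the fraction of clauses any assignment can satisfy by bounding a flattened clause matrix in spectral norm. The clause representation and all names are our own assumptions; this is not the paper's subexponential-time algorithm.

import numpy as np

def spectral_refutation_bound(n, clauses):
    """Certify an upper bound on the satisfiable fraction of a 4-XOR instance.

    Each clause is a tuple (i, j, k, l, b) with b in {+1, -1}, encoding the
    constraint x_i * x_j * x_k * x_l = b over assignments x in {+1, -1}^n.
    """
    # Flatten each clause into one entry of an (n^2 x n^2) matrix.
    A = np.zeros((n * n, n * n))
    for (i, j, k, l, b) in clauses:
        A[i * n + j, k * n + l] += b
    # For any x in {+-1}^n, the satisfied fraction equals
    #   1/2 + (1/(2m)) * (x kron x)^T A (x kron x),
    # and the quadratic form is at most ||x kron x||^2 * ||A||_2 = n^2 * ||A||_2.
    m = len(clauses)
    spectral_norm = np.linalg.norm(A, 2)
    return 0.5 + (n * n * spectral_norm) / (2 * m)

For a random instance with density m/n on the order of n^{k/2−1} (here k = 4), this certified bound concentrates near 1/2, which is what strong refutation requires at the polynomial-time density.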
We consider two problems that arise in machine learning applications: the problem of recovering a planted sparse vector in a random linear subspace and the problem of decomposing a random low-rank overcomplete 3-tensor. For both problems, the best known guarantees are based on the sum-of-squares method. We develop new algorithms inspired by analyses of the sum-of-squares method. Our algorithms achieve the same or similar guarantees as sum-of-squares for these problems but the running time is significantly faster.

For the planted sparse vector problem, we give an algorithm with running time nearly linear in the input size that approximately recovers a planted sparse vector with up to constant relative sparsity in a random subspace of R^n of dimension up to Ω(√n). These recovery guarantees match the best known ones of Barak, Kelner, and Steurer (STOC 2014) up to logarithmic factors.

For tensor decomposition, we give an algorithm with running time close to linear in the input size (with exponent ≈ 1.086) that approximately recovers a component of a random 3-tensor over R^n of rank up to Ω(n^{4/3}). The best previous algorithm for this problem due to Ge and Ma (RANDOM 2015) works up to rank Ω(n^{3/2}) but requires quasipolynomial time.
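For orientation, below is a minimal sketch of classical tensor power iteration for recovering a single component of a symmetric 3-tensor T ≈ Σ_i a_i ⊗ a_i ⊗ a_i; this is the textbook method rather than the fast algorithm described above, and the function name and parameters are our own.

import numpy as np

def tensor_power_iteration(T, iters=100, seed=0):
    """Approximately recover one component of a symmetric 3-tensor by power iteration."""
    n = T.shape[0]
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    for _ in range(iters):
        # Contract the tensor against x twice: y_i = sum_{j,k} T[i, j, k] * x_j * x_k.
        y = np.einsum('ijk,j,k->i', T, x, x)
        x = y / np.linalg.norm(y)
    return x

In the overcomplete regime considered above (rank well beyond n), plain power iteration like this is generally not expected to succeed on its own; the point of the abstract is that a suitable spectral algorithm still recovers components in close to linear time.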
The stochastic block model is a classical cluster-exhibiting random graph model that has been widely studied in statistics, physics and computer science. In its simplest form, the model is a random graph with two equal-sized clusters, with intra-cluster edge probability p and inter-cluster edge probability q. We focus on the sparse case, i.e., p, q = O(1/n), which is practically more relevant and also mathematically more challenging. A conjecture of Decelle, Krzakala, Moore and Zdeborová, based on ideas from statistical physics, predicted a specific threshold for clustering. The negative direction of the conjecture was proved by Mossel, Neeman and Sly (2012), and more recently the positive direction was proven independently by Massoulié and by Mossel, Neeman, and Sly.

In many real network clustering problems, nodes contain information as well. We study the interplay between node and network information in clustering by studying a labeled block model, where in addition to the edge information, the true cluster labels of a small fraction of the nodes are revealed. In the case of two clusters, we show that below the threshold, a small amount of node information does not affect recovery. On the other hand, we show that for any small amount of information efficient local clustering is achievable as long as the number of clusters is sufficiently large (as a function of the amount of revealed information).
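For concreteness, here is a minimal simulation sketch of the labeled sparse block model studied here, with two communities, edge probabilities a/n and b/n, and a revealed fraction delta of node labels; the parameter names and the sampling convention (independent labels rather than exactly equal-sized clusters) are our own simplifications.

import numpy as np

def sample_labeled_sbm(n, a, b, delta, seed=0):
    """Sample a two-community sparse SBM together with a revealed subset of labels."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)               # cluster assignment of each node
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, a / n, b / n)              # intra- vs inter-cluster edge probability
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # sample each pair once
    adjacency = upper | upper.T
    revealed = rng.random(n) < delta                  # mask of nodes whose true label is given
    return adjacency, labels, revealed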