We present efficient algorithms for releasing useful statistics about graph data while providing rigorous privacy guarantees. Our algorithms work on data sets that consist of relationships between individuals, such as social ties or email communication. The algorithms satisfy edge differential privacy, which essentially requires that the presence or absence of any particular relationship be hidden.Our algorithms output approximate answers to subgraph counting queries. Given a query graph H, e.g., a triangle, k-star or k-triangle, the goal is to return the number of edgeinduced isomorphic copies of H in the input graph. The special case of triangles was considered by Nissim, Raskhodnikova and Smith (STOC 2007), and a more general investigation of arbitrary query graphs was initiated by Rastogi, Hay, Miklau and Suciu (PODS 2009). We extend the approach of [NRS] to a new class of statistics, namely, k-star queries. We also give algorithms for k-triangle queries using a different approach, based on the higher-order local sensitivity. For the specific graph statistics we consider (i.e., kstars and k-triangles), we significantly improve on the work of [RHMS]: our algorithms satisfy a stronger notion of privacy, which does not rely on the adversary having a particular prior distribution on the data, and add less noise to the answers before releasing them.We evaluate the accuracy of our algorithms both theoretically and empirically, using a variety of real and synthetic data sets. We give explicit, simple conditions under which these algorithms add a small amount of noise. We also provide the average-case analysis in the Erdős-Rényi-Gilbert G(n, p) random graph model.Finally, we give hardness results indicating that the approach NRS used for triangles cannot easily be extended to k-triangles (and hence justifying our development of a new algorithmic approach).
We give algorithms for geometric graph problems in the modern parallel models such as MapReduce. For example, for the Minimum Spanning Tree (MST) problem over a set of points in the two-dimensional space, our algorithm computes a (1 + )-approximate MST. Our algorithms work in a constant number of rounds of communication, while using total space and communication proportional to the size of the data (linear space and near linear time algorithms). In contrast, for general graphs, achieving the same result for MST (or even connectivity) remains a challenging open problem [9], despite drawing significant attention in recent years.We develop a general algorithmic framework that, besides MST, also applies to Earth-Mover Distance (EMD) and the transportation cost problem. Our algorithmic framework has implications beyond the MapReduce model. For example it yields a new algorithm for computing EMD cost in the plane in near-linear time, n 1+o (1) . We note that while recently [33] have developed a near-linear time algorithm for (1 + )-approximating EMD, our algorithm is fundamentally different, and, for example, also solves the transportation (cost) problem, raised as an open question in [33]. Furthermore, our algorithm immediately gives a (1+ )-approximation algorithm with n δ space in the streamingwith-sorting model with 1/δ O(1) passes. As such, it is tempting to conjecture that the parallel models may also constitute a concrete playground in the quest for efficient algorithms for EMD (and other similar problems) in the vanilla streaming model, a well-known open problem.
We give new rounding schemes for the standard linear programming relaxation of the correlation clustering problem, achieving approximation factors almost matching the integrality gaps:• For complete graphs our approximation is 2.06 − ε, which almost matches the previously known integrality gap of 2.• For complete k-partite graphs our approximation is 3. We also show a matching integrality gap.• For complete graphs with edge weights satisfying triangle inequalities and probability constraints, our approximation is 1.5, and we show an integrality gap of 1.2.Our results improve a long line of work on approximation algorithms for correlation clustering in complete graphs, previously culminating in a ratio of 2.5 for the complete case by Ailon, Charikar and Newman (JACM'08). In the weighted complete case satisfying triangle inequalities and probability constraints, the same authors give a 2-approximation; for the bipartite case, Ailon, Avigdor-Elgrabli, Liberty and van Zuylen give a 4-approximation (SICOMP'12).
We study the problem of finding an approximate maximum matching in two closely related computational models, namely, the dynamic graph streaming model and the simultaneous multi-party communication model. In the dynamic graph streaming model, the input graph is revealed as a stream of edge insertions and deletions, and the goal is to design a small space algorithm to approximate the maximum matching. In the simultaneous model, the input graph is partitioned across k players, and the goal is to design a protocol where the k players simultaneously send a small-size message to a coordinator, and the coordinator computes an approximate matching. Dynamic graph streams. We resolve the space complexity of single-pass turnstile streaming algorithms for approximating matchings by showing that for any > 0, Θ(n 2−3) space is both sufficient and necessary (up to polylogarithmic factors) to compute an n-approximate matching; here n denotes the number of vertices in the input graph. The simultaneous communication model. Our results for dynamic graph streams also resolve the (per-player) simultaneous communication complexity for approximating matchings in the edge partition model. For the vertex partition model, we design new randomized and deterministic protocols for k players to achieve an α-approximation. Specifically, for α ≥ √ k, we provide a randomized protocol with total communication of O(nk/α 2) and a deterministic protocol with total communication of O(nk/α). Both these bounds are tight. Our work generalizes the results established by Dobzinski et al. (STOC 2014) for the special case of k = n. Finally, for the case of α = o(√ k), we establish a new lower bound on the simultaneous communication complexity which is super-linear in n.
We present an O ( √ n log n)-approximation algorithm for the problem of finding the sparsest spanner of a given directed graph G on n vertices. A spanner of a graph is a sparse subgraph that approximately preserves distances in the original graph. More precisely, given a graph G = (V , E) with nonnegative edge lengths d : E → R 0 and a stretch k 1,The previous best approximation ratio wasÕ (n 2/3 ), due to Dinitz and Krauthgamer (STOC '11).We also improve the approximation ratio for the important special case of directed 3-spanners with unit edge lengths fromÕ ( √ n ) to O (n 1/3 log n). The best previously known algorithms for this problem are due to Berman, Raskhodnikova and Ruan (FSTTCS '10) and Dinitz and Krauthgamer. The approximation ratio of our algorithm almost matches Dinitz and Krauthgamer's lower bound for the integrality gap of a natural linear programming relaxation. Our algorithm directly implies an O (n 1/3 log n)-approximation for the 3-spanner problem on undirected graphs with unit lengths. An easy O ( √ n )-approximation algorithm for this problem has been the best known for decades. Finally, we consider the Directed Steiner Forest problem: given a directed graph with edge costs and a collection of ordered vertex pairs, find a minimum-cost subgraph that contains a path between every prescribed pair. We obtain an approximation ratio of O (n 2/3+ ) for any constant > 0, which improves the O (n · min(n 4/5 , m 2/3 )) ratio due to Feldman, Kortsarz and Nutov (JCSS'12).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.