role mining, role-based access control
We describe several bottom-up approaches to problems in role engineering for Role-Based Access Control (RBAC). The salient problems are all NP-complete, even to approximate, yet we find that in instances that arise in practice these problems can be solved in minutes. We first consider role minimization, the process of finding a smallest collection of roles that can be used to implement a pre-existing user-to-permission relation. We introduce fast graph reductions that allow recovery of the solution from the solution to a problem on a smaller input graph. For our test cases, these reductions either solve the problem outright or reduce it enough that we can find the optimum solution with a (worst-case) exponential method. We introduce lower bounds that are sharp for seven of nine test cases and are within 3.4% on the other two. We introduce and test a new polynomial-time approximation that on average yields 2% more roles than the optimum. We next consider the related problem of minimizing the number of connections between roles and users or permissions, and we develop effective heuristic methods for this problem as well. Finally, we propose methods for several related problems.
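As a rough illustration of what it means for a collection of roles to implement a user-to-permission relation, the sketch below builds a naive baseline (one role per distinct permission set) and checks that the roles reproduce the relation exactly. This is not the reduction, lower bound, or approximation described in the abstract; the names `naive_roles` and `implements` and the sample data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A role is a pair (users assigned to the role, permissions granted by the role).
Role = Tuple[FrozenSet[str], FrozenSet[str]]


def naive_roles(upa: Dict[str, Set[str]]) -> List[Role]:
    """Baseline only: one role per distinct non-empty permission set.

    This gives a valid role set and an upper bound on the minimum
    number of roles; it is generally not optimal.
    """
    groups: Dict[FrozenSet[str], Set[str]] = {}
    for user, perms in upa.items():
        if perms:
            groups.setdefault(frozenset(perms), set()).add(user)
    return [(frozenset(users), perms) for perms, users in groups.items()]


def implements(roles: List[Role], upa: Dict[str, Set[str]]) -> bool:
    """Return True if the roles grant each user exactly the permissions in upa."""
    granted: Dict[str, Set[str]] = {user: set() for user in upa}
    for users, perms in roles:
        for user in users:
            granted.setdefault(user, set()).update(perms)
    return granted == {user: set(perms) for user, perms in upa.items()}


# Hypothetical user-to-permission relation.
upa = {
    "alice": {"read", "write"},
    "bob": {"read", "write"},
    "carol": {"read"},
}
roles = naive_roles(upa)
print(len(roles), implements(roles, upa))
```

Role minimization asks for the smallest role set that passes a check like `implements`; the baseline only shows that some valid role set always exists.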
Clustering problems have numerous applications and are becoming more challenging as the size of the data increases. In this paper, we consider designing clustering algorithms that can be used in MapReduce, the most popular programming environment for processing large datasets. We focus on the practical and popular clustering problems k-center and k-median. We develop fast clustering algorithms with constant-factor approximation guarantees. From a theoretical perspective, we give the first analysis showing that several clustering algorithms are in $\mathcal{MRC}^0$, a theoretical MapReduce class introduced by Karloff et al. [26]. Our algorithms use sampling to decrease the data size, and they run a time-consuming clustering algorithm such as local search or Lloyd's algorithm on the resulting data set. Our algorithms have sufficient flexibility to be used in practice since they run in a constant number of MapReduce rounds. We complement these results by performing experiments using our algorithms. We compare the empirical performance of our algorithms to several sequential and parallel algorithms for the k-median problem. The experiments show that our algorithms' solutions are similar to or better than the other algorithms' solutions. Furthermore, on data sets that are sufficiently large, our algorithms are faster than the other parallel algorithms that we tested.
... which renders sequential algorithms unusable. In situations where the amount of data is prohibitively large, the MapReduce [16] programming paradigm is used to overcome this obstacle. MapReduce and its open source counterpart Hadoop [33] are distributed computing frameworks designed to process massive data sets.
The MapReduce model is quite novel, since it interleaves sequential and parallel computation. Succinctly, MapReduce consists of several rounds of computation. There is a set of machines, each of which has a certain amount of memory available.
The memory on each machine is limited, and there is no communication between the machines during a round. In each round, the data is distributed among the machines. The data assigned to a single machine is constrained to be sub-linear in the input size. This restriction is motivated by the fact that the input size is assumed to be very large [26,15]. After the data is distributed, each machine performs some computation on the data that is available to it. The output of these computations is either the final result or the input of another MapReduce round. A more precise overview of the MapReduce model is given in Section 1.1.
Problems: In this paper, we are concerned with designing clustering algorithms that can be implemented using MapReduce. In particular, we focus on two well-studied problems: metric k-median and k-center. In both of these problems, we are given a set V of n points, together with the distances between any pair of points; we give a precise description of the input representation below. The goal is to choose k of the points. Each of the k chosen points represents a cluster and is refer...
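To make the two objectives concrete, here is a minimal sequential sketch under the standard definitions: k-center minimizes the maximum distance from a point to its nearest chosen point, and k-median minimizes the sum of those distances. The selection routine is the classic farthest-first greedy for k-center, used only as an illustration; it is not the sampling-based MapReduce algorithm of the paper. Euclidean points in the plane and all function names are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def nearest_distance(p: Point, centers: Sequence[Point]) -> float:
    """Distance from p to its closest center."""
    return min(math.dist(p, c) for c in centers)


def k_center_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """k-center objective: the largest point-to-nearest-center distance."""
    return max(nearest_distance(p, centers) for p in points)


def k_median_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """k-median objective: the sum of point-to-nearest-center distances."""
    return sum(nearest_distance(p, centers) for p in points)


def farthest_first(points: Sequence[Point], k: int) -> List[Point]:
    """Classic greedy for k-center: repeatedly add the point farthest
    from the centers chosen so far (starting from the first point)."""
    if not points or k <= 0:
        return []
    centers = [points[0]]
    closest = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        idx = max(range(len(points)), key=closest.__getitem__)
        centers.append(points[idx])
        closest = [min(d, math.dist(p, points[idx]))
                   for d, p in zip(closest, points)]
    return centers


# Hypothetical data: two well-separated pairs of points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
centers = farthest_first(points, 2)
print(centers, k_center_cost(points, centers), k_median_cost(points, centers))
```

The greedy keeps the whole point set in memory and makes k sequential passes over it, so it only illustrates the objectives; it does not respect the per-machine memory limit described above.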
We study the Minimum Submodular-Cost Allocation problem (MSCA). In this problem we are given a finite ground set $V$ and $k$ non-negative submodular set functions $f_1, \ldots, f_k$ on $V$. The objective is to partition $V$ into $k$ (possibly empty) sets $A_1, \ldots, A_k$ such that the sum $\sum_{i=1}^{k} f_i(A_i)$ is minimized. Several well-studied problems such as the non-metric facility location problem, multiway-cut in graphs and hypergraphs, and uniform metric labeling and its generalizations can be shown to be special cases of MSCA. In this paper we consider a convex-programming relaxation obtained via the Lovász-extension for submodular functions. This allows us to understand several previous relaxations and rounding procedures in a unified fashion and also develop new formulations and approximation algorithms for several problems. In particular, we give a $(1.5 - 1/k)$-approximation for the hypergraph multiway partition problem. We also give a $\min\{2(1 - 1/k), H_\Delta\}$-approximation for the hypergraph multiway cut problem when $\Delta$ is the maximum hyperedge size. Both problems generalize the multiway cut problem in graphs and the hypergraph cut problem is approximation equivalent to the node-weighted multiway cut problem in graphs.
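For readers unfamiliar with the Lovász extension on which the relaxation is built, the following sketch evaluates it with the standard sorting formula. It assumes $f(\emptyset) = 0$ and a non-negative input vector, and it uses a small graph cut function as a hypothetical example of a submodular function. The convex program and rounding procedures of the paper are not shown.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

SetFunction = Callable[[FrozenSet[str]], float]


def lovasz_extension(f: SetFunction, ground: Iterable[str],
                     x: Dict[str, float]) -> float:
    """Evaluate the Lovász extension of f at x.

    Sort elements by decreasing x-value; with prefix sets S_1, S_2, ...
    in that order, the value is sum_i (x_i - x_{i+1}) * f(S_i), where
    the value after the last element is taken to be 0.
    Assumes f(empty set) = 0 and x >= 0.
    """
    order = sorted(ground, key=lambda v: x[v], reverse=True)
    value = 0.0
    prefix = set()
    for i, v in enumerate(order):
        prefix.add(v)
        next_x = x[order[i + 1]] if i + 1 < len(order) else 0.0
        value += (x[v] - next_x) * f(frozenset(prefix))
    return value


# Hypothetical example: cut function of the path a - b - c.
edges: Tuple[Tuple[str, str], ...] = (("a", "b"), ("b", "c"))


def cut(S: FrozenSet[str]) -> float:
    """Number of edges with exactly one endpoint in S."""
    return float(sum((u in S) != (v in S) for u, v in edges))


ground = ["a", "b", "c"]
print(lovasz_extension(cut, ground, {"a": 0.0, "b": 1.0, "c": 0.0}))
print(lovasz_extension(cut, ground, {"a": 1.0, "b": 0.5, "c": 0.0}))
```

On a 0/1 indicator vector the extension agrees with $f$ on the corresponding set, which is why minimizing it over fractional assignments can serve as a relaxation of the partition objective.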
We study algorithms for the SUBMODULAR MULTIWAY PARTITION problem (SUB-MP). An instance of SUB-MP consists of a finite ground set $V$, a subset of $k$ elements $S = \{s_1, s_2, \ldots, s_k\}$ called terminals, and a non-negative submodular set function $f : 2^V \to \mathbb{R}_+$ on $V$ provided as a value oracle. The goal is to partition $V$ into $k$ sets $A_1, \ldots, A_k$ such that $s_i \in A_i$ for $1 \le i \le k$ and $\sum_{i=1}^{k} f(A_i)$ is minimized. SUB-MP generalizes some well-known problems such as the MULTIWAY CUT problem in graphs and hypergraphs, and the NODE-WEIGHTED MULTIWAY CUT problem in graphs. SUB-MP for arbitrary submodular functions (instead of just symmetric functions) was considered by Zhao, Nagamochi and Ibaraki [25]. Previous algorithms were based on greedy splitting and divide and conquer strategies. In very recent work [4] we proposed a convex-programming relaxation for SUB-MP based on the Lovász-extension of a submodular function and showed its applicability for some special cases. In this paper we obtain the following results for arbitrary submodular functions via this relaxation.
• A 2-approximation for SUB-MP. This improves the $(k - 1)$-approximation from [25].
• A $(1.5 - 1/k)$-approximation for SUB-MP when $f$ is symmetric. This improves the $2(1 - 1/k)$-approximation from [20,25].
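The problem statement can be checked on tiny instances by exhaustive search. The sketch below enumerates every assignment of non-terminal elements to terminals and returns a cheapest partition; it takes exponential time and is not one of the approximation algorithms listed above. The weighted cut function and all names are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, FrozenSet, List, Optional, Sequence, Set, Tuple

SetFunction = Callable[[FrozenSet[str]], float]


def make_cut_function(weights: Dict[Tuple[str, str], float]) -> SetFunction:
    """Weighted graph cut: total weight of edges with one endpoint in S."""
    def f(S: FrozenSet[str]) -> float:
        return sum(w for (u, v), w in weights.items() if (u in S) != (v in S))
    return f


def brute_force_sub_mp(ground: Sequence[str], terminals: Sequence[str],
                       f: SetFunction) -> Tuple[Optional[float], List[Set[str]]]:
    """Try every assignment of non-terminals to the k terminal parts and
    keep the one minimizing the sum of f over the parts (exponential time)."""
    free = [v for v in ground if v not in terminals]
    best_cost: Optional[float] = None
    best_parts: List[Set[str]] = []
    for labels in product(range(len(terminals)), repeat=len(free)):
        parts = [{s} for s in terminals]  # part i always contains terminal i
        for v, i in zip(free, labels):
            parts[i].add(v)
        cost = sum(f(frozenset(p)) for p in parts)
        if best_cost is None or cost < best_cost:
            best_cost, best_parts = cost, parts
    return best_cost, best_parts


# Hypothetical instance: path s1 - a - s2 with a heavy edge (s1, a).
f = make_cut_function({("s1", "a"): 3.0, ("a", "s2"): 1.0})
cost, parts = brute_force_sub_mp(["s1", "a", "s2"], ["s1", "s2"], f)
print(cost, parts)
```

With a cut function each cut edge is counted once for each of the two parts it separates, so the objective here is twice the weight of the corresponding multiway cut.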