Multiple phenomena often diffuse through a social network, sometimes in competition with one another. Product adoption and political elections are two examples where network diffusion is inherently competitive: individuals may select only one product from a set of competing products (e.g., most people need only one cell-phone provider) and, in most electoral systems, can vote for only one candidate on a slate. We introduce the weighted generalized annotated program (wGAP) framework for expressing competitive diffusion models. Applications need to reason about the eventual outcome of multiple competing diffusions, e.g., the likely number of sales of a given product or the number of people who will support a particular candidate. We formalize this need as the "most probable interpretation" (MPI) problem. We develop algorithms to solve MPI efficiently and show experimentally that they scale to graphs with millions of vertices.
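A minimal sketch of the competitive-diffusion setting this abstract describes, assuming a simple cascade-style simulation in place of the actual wGAP semantics and MPI computation (the function names and the independent-cascade update rule are illustrative, not the paper's):

```python
import random
from collections import defaultdict

# Illustrative sketch only: a Monte Carlo stand-in for reasoning about the
# outcome of two competing cascades.  The wGAP framework and the MPI problem
# from the abstract are defined over weighted annotated logic programs, not
# over this simple cascade simulation.

def simulate_competitive_cascade(edges, seeds_a, seeds_b, rounds=10, rng=None):
    """edges: dict mapping u -> list of (v, weight) with weight in [0, 1].
    Each node adopts at most one of the two competing products ('A' or 'B')."""
    rng = rng or random.Random()
    state = {}
    for s in seeds_a:
        state[s] = 'A'
    for s in seeds_b:
        state[s] = 'B'
    for _ in range(rounds):
        new_state = dict(state)
        for u, label in state.items():
            for v, w in edges.get(u, []):
                # A node that has already adopted a product never switches.
                if v not in new_state and rng.random() < w:
                    new_state[v] = label
        if new_state == state:
            break
        state = new_state
    return state

def estimate_adoption(edges, seeds_a, seeds_b, trials=1000):
    """Estimate expected adoption counts for each product by repeated simulation."""
    totals = defaultdict(float)
    for t in range(trials):
        final = simulate_competitive_cascade(edges, seeds_a, seeds_b,
                                             rng=random.Random(t))
        for label in final.values():
            totals[label] += 1.0 / trials
    return dict(totals)

if __name__ == "__main__":
    g = {1: [(2, 0.6), (3, 0.4)], 2: [(4, 0.5)], 3: [(4, 0.7)], 4: [(5, 0.9)]}
    print(estimate_adoption(g, seeds_a={1}, seeds_b={3}))
```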
A fundamental challenge in developing high-impact machine learning technologies is balancing the need to model rich, structured domains with the ability to scale to big data. Many important problem areas are both richly structured and large scale, from social and biological networks, to knowledge graphs and the Web, to images, video, and natural language. In this paper, we introduce two new formalisms for modeling structured data, and show that they can both capture rich structure and scale to big data. The first, hinge-loss Markov random fields (HL-MRFs), is a new kind of probabilistic graphical model that generalizes different approaches to convex inference. We unite three approaches from the randomized algorithms, probabilistic graphical models, and fuzzy logic communities, showing that all three lead to the same inference objective. We then define HL-MRFs by generalizing this unified objective. The second new formalism, probabilistic soft logic (PSL), is a probabilistic programming language that makes HL-MRFs easy to define using a syntax based on first-order logic. We introduce an algorithm for inferring most-probable variable assignments (MAP inference) that is much more scalable than general-purpose convex optimization methods, because it uses message passing to take advantage of sparse dependency structures. We then show how to learn the parameters of HL-MRFs. The learned HL-MRFs are as accurate as analogous discrete models, but much more scalable. Together, these algorithms enable HL-MRFs and PSL to model rich, structured data at scales not previously possible.
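A minimal sketch of the kind of convex MAP objective the abstract refers to: a weighted sum of hinge-loss potentials over [0,1]-valued variables, solved here with a generic bounded optimizer rather than the paper's message-passing algorithm. The toy rule encoding and all names below are illustrative assumptions, not the PSL/HL-MRF implementation:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of an HL-MRF-style MAP objective: minimize a weighted sum of
# (squared) hinge-loss potentials over variables constrained to [0, 1].
# The actual HL-MRF/PSL machinery (grounding, ADMM-based message passing,
# weight learning) is not shown here.

def hinge_map_inference(A, b, weights, n_vars, squared=True):
    """Minimize sum_j w_j * max(A_j . y + b_j, 0)^p over y in [0,1]^n_vars."""
    power = 2 if squared else 1

    def objective(y):
        slack = np.maximum(A @ y + b, 0.0)
        return float(np.sum(weights * slack ** power))

    y0 = np.full(n_vars, 0.5)
    res = minimize(objective, y0, bounds=[(0.0, 1.0)] * n_vars,
                   method="L-BFGS-B")
    return res.x

if __name__ == "__main__":
    # Variable: y[0] = Votes(bob).  Observed: Friend(alice, bob) = 1.0 and
    # Votes(alice) = 0.9.  The Lukasiewicz relaxation of the rule
    # Friend(a,b) & Votes(a) -> Votes(b) gives distance to satisfaction
    # max(0, 0.9 - y[0]), so the hinge penalizes y[0] below 0.9.
    A = np.array([[-1.0]])
    b = np.array([0.9])
    weights = np.array([5.0])
    print(hinge_map_inference(A, b, weights, n_vars=1))  # approx. [0.9]
```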
There has been extensive work in many different fields on how phenomena of interest (e.g., diseases, innovation, product adoption) "diffuse" through a social network. As social networks increasingly become a fabric of society, there is a need to make "optimal" decisions with respect to an observed model of diffusion. For example, in epidemiology, officials want to find a set of k individuals in a social network who, if treated, would minimize the spread of a disease. In marketing, campaign managers try to identify a set of k customers who, if given a free sample, would generate maximal "buzz" about the product. In this article, we first show that the well-known Generalized Annotated Program (GAP) paradigm can be used to express many existing diffusion models. We then define a class of problems called Social Network Diffusion Optimization Problems (SNDOPs). SNDOPs have four parts: (i) a diffusion model expressed as a GAP, (ii) an objective function we want to optimize with respect to a given diffusion model, (iii) an integer k > 0 describing resources (e.g., medication) that can be placed at nodes, and (iv) a logical condition VC that governs which nodes can have a resource (e.g., only children above the age of 5 can be treated with a given medication). We study the computational complexity of SNDOPs and show both NP-completeness results as well as results on complexity of approximation. We then develop an exact and a heuristic algorithm to solve a large class of SNDOP problems and show that our GREEDY-SNDOP algorithm achieves the best possible approximation ratio that a polynomial algorithm can achieve (unless P = NP). We conclude with a prototype experimental implementation to solve SNDOPs that looks at a real-world Wikipedia dataset consisting of over 103,000 edges. ACM Reference Format: Shakarian, P., Broecheler, M., Subrahmanian, V. S., and Molinaro, C. 2013. Using generalized annotated programs to solve social network diffusion optimization problems.
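An illustrative sketch of greedy seed selection in the spirit of GREEDY-SNDOP, assuming a Monte Carlo independent-cascade spread estimate in place of the paper's GAP-encoded objective; the helper names (`spread_estimate`, `vertex_condition`) are hypothetical. For monotone submodular objectives, greedy selection under a cardinality budget achieves the (1 - 1/e) guarantee the abstract alludes to:

```python
import random

# Greedy selection of k seed nodes, subject to a vertex condition VC,
# maximizing an estimated spread.  The paper's objective is defined via a
# GAP-encoded diffusion model; a simple independent-cascade estimate is
# substituted here so the example is self-contained.

def spread_estimate(edges, seeds, trials=200):
    """Expected number of reached nodes under an independent-cascade model."""
    if not seeds:
        return 0.0
    total = 0
    for t in range(trials):
        rng = random.Random(t)
        active, frontier = set(seeds), list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v, p in edges.get(u, []):
                    if v not in active and rng.random() < p:
                        active.add(v)
                        nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / trials

def greedy_seed_selection(edges, candidates, k, vertex_condition=lambda v: True):
    """Repeatedly add the node with the largest marginal gain in spread."""
    chosen = set()
    allowed = [v for v in candidates if vertex_condition(v)]
    for _ in range(k):
        best, best_gain = None, 0.0
        base = spread_estimate(edges, chosen)
        for v in allowed:
            if v in chosen:
                continue
            gain = spread_estimate(edges, chosen | {v}) - base
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:
            break
        chosen.add(best)
    return chosen

if __name__ == "__main__":
    g = {1: [(2, 0.5), (3, 0.5)], 3: [(4, 0.5)], 4: [(5, 0.8)]}
    print(greedy_seed_selection(g, candidates=[1, 2, 3, 4, 5], k=2))
```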
The Social Semantic Web (SSW) refers to the mix of RDF data in web content and social network data associated with those who posted that content. Applications to monitor the SSW are becoming increasingly popular. For instance, marketers want to look for semantic patterns in the content of tweets and Facebook posts about their products. Such applications allow multiple users to specify patterns of interest and monitor them in real time as new data gets added to the web or to a social network. In this paper, we develop the concept of SSW view servers, from which all of these types of applications can be monitored simultaneously. The patterns of interest are views. We show that a given set of views can be compiled in multiple possible ways to take advantage of common substructures, and define the concept of an optimal merge. We develop a very fast MultiView algorithm that scalably and efficiently maintains multiple subgraph views. We show that our algorithm is correct, study its complexity, and experimentally demonstrate that our algorithm can scalably handle updates to hundreds of views on real-world SSW databases with up to 540M edges.
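A toy sketch of maintaining several graph-pattern "views" over a stream of RDF-style triples, loosely echoing the shared-substructure idea behind view merging; the class name, the restriction to two-edge path patterns, and the indexing scheme are assumptions for illustration, not the paper's MultiView algorithm or its optimal-merge construction:

```python
from collections import defaultdict

# Each view is a two-edge path pattern (?x -p1-> ?y -p2-> ?z) over labeled
# triples; views sharing a predicate reuse the same per-predicate index.
class PathViewServer:
    def __init__(self, views):
        # views: dict view_name -> (p1, p2)
        self.views = views
        self.out_by_pred = defaultdict(lambda: defaultdict(set))  # p -> s -> {o}
        self.in_by_pred = defaultdict(lambda: defaultdict(set))   # p -> o -> {s}
        self.matches = defaultdict(set)                           # view -> {(x, y, z)}

    def add_triple(self, s, p, o):
        self.out_by_pred[p][s].add(o)
        self.in_by_pred[p][o].add(s)
        for name, (p1, p2) in self.views.items():
            if p == p1:  # new edge can serve as the first hop of the pattern
                for z in self.out_by_pred[p2].get(o, ()):
                    self.matches[name].add((s, o, z))
            if p == p2:  # new edge can serve as the second hop of the pattern
                for x in self.in_by_pred[p1].get(s, ()):
                    self.matches[name].add((x, s, o))

if __name__ == "__main__":
    server = PathViewServer({"mentions_product": ("posted", "mentions"),
                             "follows_poster": ("follows", "posted")})
    server.add_triple("alice", "posted", "tweet1")
    server.add_triple("tweet1", "mentions", "phoneX")
    server.add_triple("bob", "follows", "alice")
    print(dict(server.matches))
```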
Probabilistic logic programs (PLPs) define a set of probability distribution functions (PDFs) over the set of all Herbrand interpretations of the underlying logical language. When answering a query Q, a lower and upper bound on Q is obtained by optimizing (min and max) an objective function subject to a set of linear constraints whose solutions are the PDFs mentioned above. A common critique not only of PLPs but of many probabilistic logics is that the difference between the upper bound and lower bound is large, thus often providing very little useful information in the query answer. In this paper, we provide a new method to answer probabilistic queries that computes a histogram "mapping" the probability that the objective function takes a value in a given interval, subject to the above linear constraints. This allows the system to return to the user a histogram from which the user can directly "see" the most likely probability range for the query. We prove that computing these histograms is #P-hard, and show that computing these histograms is closely related to polyhedral volume computation. We show how existing randomized algorithms for volume computation can be adapted to the computation of such histograms. A prototype experimental implementation is discussed.
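A toy sketch of the two computations the abstract describes, assuming a tiny PLP with two ground atoms a and b (so four Herbrand interpretations) and illustrative interval constraints on P(a) and P(b); the rejection sampling used for the histogram is a crude stand-in for the paper's adapted volume-computation algorithms:

```python
import numpy as np
from scipy.optimize import linprog

# Variables p[0..3] are the probabilities of the interpretations
# {}, {a}, {b}, {a,b}.  Suppose the program constrains P(a) to [0.6, 0.9]
# and P(b) to [0.3, 0.7]; the query is Q = a & b, whose probability is p[3].
A_ub = np.array([[0, -1, 0, -1],   # -(p1 + p3) <= -0.6   i.e. P(a) >= 0.6
                 [0,  1, 0,  1],   #   p1 + p3  <=  0.9
                 [0,  0, -1, -1],  # -(p2 + p3) <= -0.3
                 [0,  0,  1,  1]]) #   p2 + p3  <=  0.7
b_ub = np.array([-0.6, 0.9, -0.3, 0.7])
A_eq, b_eq = np.array([[1.0, 1.0, 1.0, 1.0]]), np.array([1.0])
query = np.array([0.0, 0.0, 0.0, 1.0])   # objective: P(a & b) = p3

# Classical lower/upper bounds on P(Q) via linear programming.
lower = linprog(query, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq).fun
upper = -linprog(-query, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq).fun
print(f"P(Q) bounds: [{lower:.2f}, {upper:.2f}]")

# Crude histogram of P(Q) over the constrained polytope: sample uniformly
# from the probability simplex and keep points satisfying the constraints.
rng = np.random.default_rng(0)
samples = rng.dirichlet(np.ones(4), size=200000)
feasible = samples[(samples @ A_ub.T <= b_ub + 1e-9).all(axis=1)]
counts, bin_edges = np.histogram(feasible @ query, bins=10, range=(lower, upper))
for lo, hi, c in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"[{lo:.2f}, {hi:.2f}): {c / max(len(feasible), 1):.3f}")
```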