Graphs are fundamental data structures and have been employed for centuries to model real-world systems and phenomena. Random walk with restart (RWR) provides a good proximity score between two nodes in a graph, and it has been successfully used in many applications such as automatic image captioning, recommender systems, and link prediction. The goal of this work is to find nodes that have topk highest proximities for a given node. Previous approaches to this problem find nodes efficiently at the expense of exactness. The main motivation of this paper is to answer, in the affirmative, the question, 'Is it possible to improve the search time without sacrificing the exactness?'. Our solution, Kdash, is based on two ideas: (1) It computes the proximity of a selected node efficiently by sparse matrices, and (2) It skips unnecessary proximity computations when searching for the top-k nodes. Theoretical analyses show that K-dash guarantees result exactness. We perform comprehensive experiments to verify the efficiency of K-dash. The results show that K-dash can find top-k nodes significantly faster than the previous approaches while it guarantees exactness.
Online social media are increasingly facilitating our social interactions, thereby making available a massive “digital fossil” of human behavior. Discovering and quantifying distinct patterns using these data is important for studying social behavior, although the rapid time-variant nature and large volumes of these data make this task difficult and challenging. In this study, we focused on the emergence of “collective attention” on Twitter, a popular social networking service. We propose a simple method for detecting and measuring the collective attention evoked by various types of events. This method exploits the fact that tweeting activity exhibits a burst-like increase and an irregular oscillation when a particular real-world event occurs; otherwise, it follows regular circadian rhythms. The difference between regular and irregular states in the tweet stream was measured using the Jensen-Shannon divergence, which corresponds to the intensity of collective attention. We then associated irregular incidents with their corresponding events that attracted the attention and elicited responses from large numbers of people, based on the popularity and the enhancement of key terms in posted messages or “tweets.” Next, we demonstrate the effectiveness of this method using a large dataset that contained approximately 490 million Japanese tweets by over 400,000 users, in which we identified 60 cases of collective attentions, including one related to the Tohoku-oki earthquake. “Retweet” networks were also investigated to understand collective attention in terms of social interactions. This simple method provides a retrospective summary of collective attention, thereby contributing to the fundamental understanding of social behavior in the digital era.
In this paper, we propose four parallel algorithms (IVPA, SPA, HPA and HPA-ELD) for mining association rules on shared-nothing parallel machines to improve its performance.In NPA, candidate itemsets are just copied amongst all the processors, which can lead to memory overflow for large transaction databases. The remaining three algorithms partition the candidate itemsets over the processors. I f it is partitioned simply (SPA), transaction data has to be broadcast to all processors. HPA partitions the candidate itemsets using a hash function to eliminate broadcasting, which also reduces the comparison workload significantly. HPA-ELD fully utilizes the available memory space by detecting the extremely large itemsets and copying them, which is also very eflective at Battering the load over the processors.W e implemented these algorithms in a sharednothing environment. Performance evaluations show that the best algorithm, HPA-ELD, attains good linearity on speedup ratio and is effective for handling skew.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.