Background: We propose an efficient and biologically sensitive algorithm based on repeated random walks (RRW) for discovering functional modules, e.g., complexes and pathways, within large-scale protein networks. Compared to existing cluster identification techniques, RRW implicitly makes use of network topology, edge weights, and long-range interactions between proteins.
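The abstract does not spell out the RRW procedure itself. As a rough illustration of how a random walk can draw on topology, edge weights, and long-range interactions at once, the sketch below computes random-walk-with-restart affinities from a single seed node on a small weighted graph. Treating restart walks as the building block is an assumption here, and the function name, the `restart` parameter, and the toy graph are all hypothetical; this is not the authors' implementation.

```python
def random_walk_with_restart(graph, start, restart=0.3, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that, at each step,
    jumps back to `start` with probability `restart` and otherwise follows
    an outgoing edge chosen in proportion to its weight.

    graph: dict mapping node -> dict of neighbor -> non-negative weight.
    (Hypothetical helper for illustration only.)
    """
    p = {n: 0.0 for n in graph}
    p[start] = 1.0
    for _ in range(max_iter):
        nxt = {n: 0.0 for n in graph}
        for u, nbrs in graph.items():
            if p[u] == 0.0:
                continue
            total = sum(nbrs.values())
            if total == 0:
                # Dangling node: send its walking mass back to the seed.
                nxt[start] += (1 - restart) * p[u]
                continue
            for v, w in nbrs.items():
                nxt[v] += (1 - restart) * p[u] * w / total
        nxt[start] += restart  # restart mass always returns to the seed
        diff = sum(abs(nxt[n] - p[n]) for n in graph)
        p = nxt
        if diff < tol:
            break
    return p


def add_edge(graph, u, v, w):
    """Add an undirected weighted edge to the adjacency dict."""
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w


# Toy network: two tightly linked triangles joined by one weak edge.
toy = {}
for u, v in [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f")]:
    add_edge(toy, u, v, 1.0)
add_edge(toy, "c", "d", 0.1)

affinity = random_walk_with_restart(toy, "a")
top3 = sorted(affinity, key=affinity.get, reverse=True)[:3]
```

In this toy case the three highest-affinity nodes are the seed's own triangle, because the weak bridging edge carries little probability mass. A cluster-discovery method could repeat such walks from a growing set of nodes, but how RRW does this is not described in the abstract.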
The identification of clusters, i.e., well-connected components in a graph, is useful in many applications, from biological function prediction to social community detection. However, finding these clusters becomes difficult as graph sizes increase. Most current graph clustering algorithms scale poorly in terms of time or memory. An important insight is that many clustering applications need only the subset of best clusters, not all clusters in the entire graph. In this paper we propose a new technique, Top Graph Clusters (TopGC), which probabilistically searches large, edge-weighted, directed graphs for their best clusters in linear time. The algorithm is inherently parallelizable and is able to find variable-size, overlapping clusters. To increase scalability, a parameter is introduced that controls memory use. When compared with three other state-of-the-art clustering techniques, TopGC achieves running time speedups of up to 70% on large-scale real-world datasets. In addition, the clusters returned by TopGC are consistently found to be better both in calculated score and when compared on real-world benchmarks.
Social media, including Facebook, Twitter, Instagram, and Snapchat, offer new means of communication, networking, and community building. Social media are mechanisms by which millions of people spread, share, and exchange information, ranging from sports and politics to health and illness. Twitter users, in particular, also build communities on topics of interest. This paper analyzes Twitter content to examine the extent to which the topic of “violence against women” is posted and disseminated. We know very little about the intersection of social media and the social problem of “violence against women.” Is Twitter being used to advance advocacy efforts, seek information and assistance, and/or build communities among advocates and/or victims? First, we need to know whether and to what degree Twitter contains posts on the topic of violence against women (VAW). This paper offers the first exploration into Twitter postings related to the topic of VAW. We collected 2.5 million tweets posted from 2007 through 2015. We then classified postings (referred to as “Tweets”). We compared posting on the topic of VAW to posting related to nine topics: politics, entertainment, sports, women, relationships, fashion, kids, school, and food. We found a small but actively engaged community that Tweets about VAW. Twitter users who post on the topic of VAW reply to one another in each conversation thread, but they rarely disseminate conversations through Retweeting. Our exploratory findings suggest that more might be learned from future studies that investigate the use of social media on the topic of VAW.
Social and communication networks across the world generate vast amounts of graph-like data each day. Modeling and predicting how these communication structures evolve can be highly useful for many applications. Previous research in this area has focused largely on using past graph structure to predict future links. However, a useful observation is that many graph datasets have additional information associated with them beyond just their graph structure. In particular, communication graphs (such as email, Twitter, and blog graphs) have information content associated with their graph edges. In this paper we examine the link between information content and graph structure, proposing a new graph modeling approach, GC-Model, which combines both. We then apply this model to multiple real-world communication graphs, demonstrating that the built models can be used effectively to predict future graph structure and information flow. On average, GC-Model's top predictions covered 19% more of the actual future graph communication structure when compared to other previously introduced algorithms, far outperforming multiple link prediction methods and several naive approaches.