The web today is increasingly characterized by social and real-time signals, which we believe represent two frontiers in information retrieval. In this paper, we present Earlybird, the core retrieval engine that powers Twitter's real-time search service. Although Earlybird builds and maintains inverted indexes like nearly all modern retrieval engines, its index structures differ from those built to support traditional web search. We describe these differences and present the rationale behind our design. A key requirement of real-time search is the ability to ingest content rapidly and make it searchable immediately, while concurrently supporting low-latency, high-throughput query evaluation. These demands are met with a single-writer, multiple-reader concurrency model and the targeted use of memory barriers. Earlybird represents a point in the design space of real-time search engines that has worked well for Twitter's needs. By sharing our experiences, we hope to spur additional interest and innovation in this exciting space.
Schools of fish and flocks of birds are examples of self-organized animal groups that arise through social interactions among individuals. We numerically study two individual-based models, which recent empirical studies have suggested to explain self-organized group animal behavior: (i) a zone-based model where the group communication topology is determined by finite interacting zones of repulsion, attraction, and orientation among individuals; and (ii) a model where the communication topology is described by Delaunay triangulation, which is defined by each individual's Voronoi neighbors. The models include a tunable parameter that controls an individual's relative weighting of attraction and alignment. We perform computational experiments to investigate how effectively simulated groups transfer information in the form of velocity when an individual is perturbed. A cross-correlation function is used to measure the sensitivity of groups to sudden perturbations in the heading of individual members. The results show how relative weighting of attraction and alignment, location of the perturbed individual, population size, and the communication topology affect group structure and response to perturbation. We find that in the Delaunay-based model an individual who is perturbed is capable of triggering a cascade of responses, ultimately leading to the group changing direction. This phenomenon has been seen in self-organized animal groups in both experiments and nature.
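The cross-correlation measure mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact definition used, so the normalized, lagged form below is an assumption, as are the function names `cross_correlation` and `best_lag` and the toy heading series. It also assumes headings are small deviations that do not wrap around the circle.

```python
import math


def cross_correlation(x, y, lag):
    """Normalized cross-correlation of x[t] with y[t + lag], for lag >= 0.

    x and y are equal-length heading time series (assumed unwrapped).
    """
    n = len(x) - lag
    if n < 2:
        raise ValueError("series too short for this lag")
    xs, ys = x[:n], y[lag:lag + n]
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / (sx * sy)


def best_lag(x, y, max_lag):
    """Lag in 0..max_lag at which y most closely follows x."""
    return max(range(max_lag + 1), key=lambda k: cross_correlation(x, y, k))


# Hypothetical data: the perturbed individual turns at step 5,
# and a neighbor makes the same turn three steps later.
perturbed = [0.0] * 5 + [1.0] * 10
neighbor = [0.0] * 8 + [1.0] * 7
delay = best_lag(perturbed, neighbor, max_lag=5)
```

Under these assumptions, the lag with the highest correlation estimates how long the neighbor takes to respond to the perturbation, and the peak value indicates how strongly it responds.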
Information propagation in social media depends not only on the static follower structure but also on the topic-specific user behavior. Hence novel models incorporating dynamic user behavior are needed. To this end, we propose a model for individual social media users, termed a genotype. The genotype is a per-topic summary of a user's interest, activity and susceptibility to adopt new information. We demonstrate that user genotypes remain invariant within a topic by adopting them for classification of new information spread in large-scale real networks. Furthermore, we extract topic-specific influence backbone structures based on content adoption and show that their structure differs significantly from the static follower network. We also find, at the population level using a simple contagion model, that hashtags of a known topic propagate at the greatest rate on backbone networks of the same topic. When employed for influence prediction of new content spread, our genotype model and influence backbones enable more than 20% improvement, compared to purely structural features. It is also demonstrated that knowledge of user genotypes and influence backbones allows for the design of effective strategies for latency minimization of topic-specific information spread.
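As an illustration of the population-level contagion experiment mentioned above, the following is a minimal sketch. The abstract does not specify the contagion model, so this assumes an independent-cascade-style process in which each newly adopting user gets one chance to pass the hashtag to each follower with a fixed probability `p`. The function name `simulate_contagion`, the edge-list format, and the toy network are hypothetical.

```python
import random


def simulate_contagion(edges, seeds, p, rng=None, max_steps=100):
    """Spread an item over a directed network, one hop per step.

    edges: iterable of (source, follower) pairs; influence flows source -> follower.
    seeds: users who have adopted at step 0.
    p: assumed per-edge adoption probability.
    Returns the set of adopters and the adopter count after each step.
    """
    rng = rng or random.Random(0)
    followers = {}
    for src, dst in edges:
        followers.setdefault(src, []).append(dst)

    adopted = set(seeds)
    frontier = list(seeds)
    history = [len(adopted)]
    for _ in range(max_steps):
        if not frontier:
            break
        newly_adopted = []
        for user in frontier:
            for follower in followers.get(user, []):
                if follower not in adopted and rng.random() < p:
                    adopted.add(follower)
                    newly_adopted.append(follower)
        frontier = newly_adopted
        history.append(len(adopted))
    return adopted, history


# Hypothetical backbone: a influences b and d, b influences c, e influences a.
backbone = [("a", "b"), ("b", "c"), ("a", "d"), ("e", "a")]
adopters, history = simulate_contagion(backbone, seeds=["a"], p=1.0)
```

Running the same seeds and probability over backbones extracted for different topics, and comparing how quickly `history` grows, is one way to compare propagation rates across networks in the spirit of the experiment described.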
Information propagation in social media depends not only on the static follower structure but also on the topic-specific user behavior. Hence novel models incorporating dynamic user behavior are needed. To this end, we propose a model for individual social media users, termed a genotype. The genotype is a per-topic summary of a user's interest, activity and susceptibility to adopt new information. We demonstrate that user genotypes remain invariant within a topic by adopting them for classification of new information spread in large-scale real networks. Furthermore, we extract topic-specific influence backbone structures based on information adoption and show that they differ significantly from the static follower network. When employed for influence prediction of new content spread, our genotype model and influence backbones enable more than 20% improvement, compared to purely structural features. We also demonstrate that knowledge of user genotypes and influence backbones allows for the design of effective strategies for latency minimization of topic-specific information spread.