Recommendation systems are ubiquitous and impact many domains; they have the potential to influence product consumption, individuals' perceptions of the world, and life-altering decisions. These systems are often evaluated or trained with data from users already exposed to algorithmic recommendations; this creates a pernicious feedback loop. Using simulations, we demonstrate that relying on data confounded in this way homogenizes user behavior without increasing utility.
We present a large-scale study of television viewing habits, focusing on how individuals adapt their preferences when consuming content with others. While there has been a great deal of research on modeling individual preferences, there has been considerably less work studying the preferences of groups, due mostly to the difficulty of collecting group data. In contrast to most past work, which has relied on small-scale surveys, prototypes, or a relatively limited amount of group preference data, we explore more than 4 million logged household views paired with individual-level demographic and co-viewing information. Our analysis reveals how engagement in group viewing varies by viewer and content type, and how viewing patterns shift across group contexts. Furthermore, we leverage this large-scale dataset to directly estimate how individual preferences are combined in group settings, finding subtle deviations from traditional models of preference aggregation. We present a simple model that captures these effects and discuss the impact of these findings on the design of group recommendation systems.
Significant events are characterized by interactions between entities (such as countries, organizations, or individuals) that deviate from typical interaction patterns. Analysts, including historians, political scientists, and journalists, commonly read large quantities of text to construct an accurate picture of when and where an event happened, who was involved, and in what ways. In this paper, we present the Capsule model for analyzing documents to detect and characterize events of potential significance. Specifically, we develop a model based on topic modeling that distinguishes between topics that describe "business as usual" and topics that deviate from these patterns. To demonstrate this model, we analyze a corpus of over two million U.S. State Department cables from the 1970s. We provide an open-source implementation of an inference algorithm for the model and a pipeline for exploring its results.