Advances in computing power, natural language processing, and digitization of text now make it possible to study a culture's evolution through its texts using a 'big data' lens. Our ability to communicate relies in part upon a shared emotional experience, with stories often following distinct emotional trajectories and forming patterns that are meaningful to us. Here, by classifying the emotional arcs for a filtered subset of 1,327 stories from Project Gutenberg's fiction collection, we find a set of six core emotional arcs which form the essential building blocks of complex emotional trajectories. We strengthen our findings by separately applying matrix decomposition, supervised learning, and unsupervised learning. For each of these six core emotional arcs, we examine the closest characteristic stories in publication today and find that particular emotional arcs enjoy greater success, as measured by downloads.
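As a rough illustration of what an emotional arc is, the following minimal sketch averages word-level sentiment scores over sliding windows of a text. The lexicon `TOY_LEXICON`, the function `emotional_arc`, and the window and step sizes are all hypothetical choices made for this example; the abstract does not specify how the arcs were actually computed.

```python
from statistics import mean

# Hypothetical toy lexicon mapping words to a valence score.
# These values are invented for illustration and are not from the source.
TOY_LEXICON = {
    "joy": 1.0, "love": 1.0, "win": 0.5,
    "loss": -0.5, "grief": -1.0, "death": -1.0,
}

def emotional_arc(words, window, step):
    """Return the mean lexicon score of each sliding window of words.

    Words missing from the lexicon are skipped; a window with no scored
    words contributes a neutral 0.0.
    """
    arc = []
    for start in range(0, len(words) - window + 1, step):
        scores = [TOY_LEXICON[w] for w in words[start:start + window]
                  if w in TOY_LEXICON]
        arc.append(mean(scores) if scores else 0.0)
    return arc

# A tiny made-up "story" whose sentiment falls and then recovers.
story = "joy love win the a loss grief death the a win love joy".split()
arc = emotional_arc(story, window=3, step=2)
```

On this toy input the arc starts high, reaches its minimum in the middle window, and rises again at the end, which is the kind of trajectory shape a classification of arcs would operate on.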
We introduce an unsupervised pattern recognition algorithm termed the Discrete Shocklet Transform (DST) by which local dynamics of time series can be extracted. Time series that are hypothesized to be generated by underlying deterministic mechanisms have significantly different DSTs than do purely random null models. We apply the DST to a sociotechnical data source, usage frequencies for a subset of words on Twitter over a decade, and demonstrate the ability of the DST to filter high-dimensional data and automate the extraction of anomalous behavior.
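The general idea of extracting local dynamics by matching a shock-like shape at several time scales can be sketched as follows. This is a simplified stand-in, not the DST itself: the step-shaped kernel, the width normalization, and the names `step_kernel`, `shock_response`, and `strongest_shock` are assumptions introduced for this example.

```python
def step_kernel(width):
    """Zero-mean step shape: -1 on the first half, +1 on the second half.

    A hypothetical stand-in for a shock-like kernel; the kernel has
    2 * (width // 2) samples.
    """
    half = width // 2
    return [-1.0] * half + [1.0] * half

def shock_response(series, width):
    """Sliding dot product of the series with the step kernel."""
    kernel = step_kernel(width)
    k = len(kernel)
    return [
        sum(x * w for x, w in zip(series[i:i + k], kernel))
        for i in range(len(series) - k + 1)
    ]

def strongest_shock(series, widths):
    """Return (score, index, width) of the largest width-normalized response.

    The index is the position where the kernel switches from -1 to +1.
    Ties keep the first candidate found.
    """
    best = None
    for width in widths:
        k = len(step_kernel(width))
        for i, r in enumerate(shock_response(series, width)):
            score = r / k
            if best is None or score > best[0]:
                best = (score, i + k // 2, width)
    return best

# Toy series: flat at 0, then an abrupt jump to 5 at index 10.
series = [0.0] * 10 + [5.0] * 10
best = strongest_shock(series, widths=[4, 8])
```

Scanning several widths lets the same procedure flag abrupt changes that unfold over different durations, and a flat series produces a zero response everywhere.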
Sports are spontaneous generators of stories. Through skill and chance, the script of each game is dynamically written in real time by players acting out possible trajectories allowed by a sport's rules. By properly characterizing a given sport's ecology of "game stories," we are able to capture the sport's capacity for unfolding interesting narratives, in part by contrasting them with random walks. Here we explore the game story space afforded by a data set of 1310 Australian Football League (AFL) score lines. We find that AFL games exhibit a continuous spectrum of stories rather than distinct clusters. We show how coarse graining reveals identifiable motifs ranging from last-minute comeback wins to one-sided blowouts. Through an extensive comparison with biased random walks, we show that real AFL games deliver a broader array of motifs than null models, and we provide consequent insights into the narrative appeal of real games.
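A biased random walk null model of the kind mentioned above might be sketched as below. The sketch treats every scoring event as a unit step in the score differential, which is a simplification, and the names `biased_walk` and `lead_changes`, the bias parameter `p_home`, and the event count are all hypothetical rather than taken from the study.

```python
import random

def biased_walk(n_events, p_home, rng):
    """Simulate a score-differential trajectory.

    Each scoring event moves the lead by +1 with probability p_home
    and by -1 otherwise. The returned path starts at 0.
    """
    lead = 0
    path = [0]
    for _ in range(n_events):
        lead += 1 if rng.random() < p_home else -1
        path.append(lead)
    return path

def lead_changes(path):
    """Count how often the nonzero lead switches sign along a path."""
    changes = 0
    last_sign = 0
    for x in path:
        sign = (x > 0) - (x < 0)
        if sign != 0:
            if last_sign != 0 and sign != last_sign:
                changes += 1
            last_sign = sign
    return changes

# Simulate a small ensemble of null-model "games" with a seeded generator.
rng = random.Random(0)
games = [biased_walk(50, p_home=0.55, rng=rng) for _ in range(100)]
change_counts = [lead_changes(g) for g in games]
```

Summary statistics such as `change_counts` could then be compared between simulated and real score lines to see which narrative features a null model fails to reproduce.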