This article examines the sociotechnical imaginary within which contemporary biometric listening or VIA (voice identification and analysis) technologies are being developed. Starting from a key 1940s article on voiceprint identification, I interrogate the conceptual link between voice, body, and identity, which was central to these early attempts at technologizing voice identification. By surveying patents that delineate systems for voice identification, collection methods for voice data, and voice analysis, I find that the VIA industry depends on the conceptual affixing of voice to identity, grounded in a reduction that treats voice as a fixed, extractable, and measurable ‘sound object’ located within the body. This reduction informs the thinking of developers in the VIA industry, resulting in a reframing of the technological shortcomings of voice identification under the rubric of big data. Ultimately, this reframing rationalizes the implementation of audio surveillance systems into existing telecommunications infrastructures, through which voice data is acquired on a massive scale.
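To make the reduction described above concrete, the following minimal sketch (not drawn from the article or any particular vendor) shows the kind of pipeline the VIA imaginary presumes: a recording is collapsed into a fixed, measurable feature vector, and identity claims are adjudicated by comparing such vectors. The file names, the MFCC-averaging scheme, and the acceptance threshold are illustrative assumptions only.

```python
# Illustrative sketch of a "voiceprint" comparison, assuming librosa and numpy
# are available. Paths, feature choices, and the 0.85 threshold are invented
# for illustration; they do not describe any system discussed in the article.
import numpy as np
import librosa

def voiceprint(path: str) -> np.ndarray:
    """Reduce a recording to a fixed-length vector of time-averaged MFCCs."""
    y, sr = librosa.load(path, sr=16000)                  # load audio at 16 kHz
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)    # (20, n_frames) matrix
    return mfcc.mean(axis=1)                              # collapse time: one "print"

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two voiceprints (1.0 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical enrollment vs. verification comparison.
enrolled = voiceprint("enrolled_speaker.wav")
candidate = voiceprint("unknown_caller.wav")
print("identity accepted" if similarity(enrolled, candidate) > 0.85
      else "identity rejected")
```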
There is a gap in existing critical scholarship that engages with the ways in which current “machine listening” or voice analytics/biometric systems intersect with the technical specificities of machine learning. This article examines the sociotechnical assemblage of machine learning techniques, practices, and cultures that underlie these technologies. After engaging with practitioners working in companies that develop machine listening systems, including CEOs, machine learning engineers, data scientists, and business analysts, among others, I draw attention to the centrality of “learnability” as a malleable conceptual framework that bends according to various “ground-truthing” practices in formalizing certain listening-based prediction tasks for machine learning. In response, I introduce a process I call Ground Truth Tracings to examine the various ontological translations that occur in training a machine to “learn to listen.” Ultimately, by further examining this notion of learnability through the aperture of power, I take insights acquired through my fieldwork in the machine listening industry and propose a strategically reductive heuristic through which the epistemological and ethical soundness of machine learning, writ large, can be contemplated.
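As a toy illustration of how a listening-based prediction task is typically formalized (a minimal sketch under invented assumptions, not the author's Ground Truth Tracings method), the example below shows that the labels array supplied at training time is the “ground truth”: whatever annotation practice produced it fixes what the model can be said to have learned to hear. Feature values and the “angry caller” label are hypothetical.

```python
# Minimal sketch of ground-truthing a listening task, assuming scikit-learn
# and numpy. The features and labels are synthetic stand-ins for annotated
# call-centre audio; the scenario is hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Pretend these rows are acoustic features extracted from recorded calls.
features = rng.normal(size=(200, 8))

# Ground truth: annotators (or a proxy signal) decided which calls count as
# "angry caller" (1) vs. "not angry" (0). The ontological translation from a
# lived situation to a binary label happens here, before any learning.
labels = (features[:, 0] + 0.5 * features[:, 3] > 0).astype(int)

model = LogisticRegression().fit(features, labels)
print("training accuracy:", model.score(features, labels))
# High accuracy shows the model reproduces the annotation scheme,
# not that "anger" itself has been measured.
```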
This study takes a cultural anthropological approach to the use of music taste as an instrument of self-presentation on online dating platforms by examining the partnership between Spotify and Tinder, which not only allows Tinder users to pick an anthem from Spotify’s catalog, but also displays a list of “top artists” based on data aggregated through their activity on Spotify. Using Cheney-Lippold’s formulation of the “measurable type” and Bucher’s notion of “conscious clicking” as foundational frameworks, this paper offers the term “conscious listening” to explore the ways that users play with their music taste-based identities on Tinder. To better theorize this phenomenon, insights from a wide array of thinkers across communications, sociology, and data/platform studies are brought together with the results of in-depth interviews with 10 Tinder users to analyze the recent convergence of music streaming and social dating platforms.