This chapter is a methodological and ethical examination of the following research question: How does one transcribe and analyze the presence and practices of the kindergarten’s youngest children, and other human and non-human actors, in a way that shows their agency and contributions? Video recording is a common method of data generation in social science research, and it poses the challenge of bringing audiovisual data into academic texts. Traditionally, video recordings are transformed into purely textual representations called transcriptions. In particular, audible speech acts often inform the transcription and analysis, at the expense of the visual content of the recordings. The embodied, experiential, aesthetic, material and multimodal can be difficult to present through written language alone. Materiality, nonverbal behavior and bodily interaction take on greater significance with one- and two-year-old children, whose expressions are often predominantly nonverbal. Through examples from pedagogical video research in kindergarten, the author explores how different ways of transcribing and analyzing video can visualize the presence, interactions, and practices of the youngest children, the kindergarten teachers and other actors. A visual turn toward poetic video-transcription, multimodal transcription, and a hybrid of drawing and transcription that the author has named ‘cartoon transcription’ helps to limit the marginalization of the different actors and their multifaceted contributions. Breaking the hegemony of the word over the visual can make bodily, spatial, and material resources visible. A fusion of words and images can produce new forms of knowledge, contribute to a high epistemological standard and provide transparency.