Science communication is undergoing a digital shift that results in the remediation and emergence of genres that help bring science to expert and semiexpert audiences (Luzón & Pérez-Llantada, 2019, 2022). One such genre is the clinical picture (CP), which consists of a written and an audiovisual version of a brief, to-the-point presentation of a medical case or condition. This genre, as detailed in this study, may have a clearly stated pedagogical purpose: promoting diagnostic expertise. This study explores the structure of CPs, the variety of strategies authors use in them to express stance and engage the audience, and the multimodal configuration of the genre. For this purpose, we draw on a dataset of 10 CP samples provided by The Lancet. Methodologically, we first conduct a linguistic analysis centred on rhetorical steps and interpersonal strategies and then a multimodal analysis to identify the configuration of CPs. Overall, the results show the use of interpersonal strategies across the two versions/formats, the added value of adopting a multimodal approach to explore the data, and the complementarity of the two versions in disseminating medical knowledge among doctors and doctors in training. Pedagogically, the outcomes of the study support the incorporation of this innovative genre in ESP and EMI classes to enhance students’ multimodal literacy.