This study analyzes the interplay of semiotic modes that a teacher and music students employ for instructing, learning, and discussing in a chamber music lesson. In particular, it describes how specific higher-level actions are accomplished through the mutual contextualization of talk and further audible and visible semiotic resources, such as gesture, gaze, material objects, vocalizing, and music. The focus lies on modal complexity, i.e., how different modes cohere to build action, and on modal intensity, i.e., the importance of specific modes in relation to their useful modal reaches. The study also attends to how the participants link and coherently coordinate interactional turns to achieve a mutual understanding of musical ideas and concepts. The rich multimodal texture of instruction, negotiation, and discussion actions in chamber music lessons underscores the role of multimodality and multimodal coherence in investigating music and pedagogy from an interactional perspective.