This article examines dyadic teamwork via video conferencing (inter)actions and explicates communicating and accepting knowledge, coordinating attention, and disagreeing. We demonstrate that such knowledge communication, which the literature often views as solely or primarily language-based, is in fact always multimodal. Communicating knowledge, coordinating attention, and disagreeing are always performed through the interconnection of multiple modes, from gaze and gesture to posture and object handling, and may be produced with or without language. Our findings show that the verbal acceptance of knowledge lags well behind the action that has already demonstrated a participant's acceptance of another's knowledge. Language use also tells us little about the attention a participant is paying, as being quiet can easily be misinterpreted as listening. Further, our findings show that language is never used alone in disagreements; rather, language may build an aggregate with other modes, and it may be super-ordinated or sub-ordinated to other modes in (inter)action. The article illustrates the complexity of everyday knowledge communication, which is relevant to educational and, in particular, organizational settings.