“…In short, language used in face-to-face online communication is a multimodal phenomenon (Kendon, 2014; Perniss, 2018; Vigliocco et al., 2014) enacted through the combination of different resources (speech/signs, body gestures and object manipulations), the use of which is pervasive and, we argue, advantageous to comprehension and learning. In the language as situated framework, these features are not defined negatively (e.g., "non-linguistic" signals, "non-manual" components), but are instead conceived as part and parcel of language (Kendon, 2012, 2014; Liddell, 2003; Perniss, 2018; Slobin, 2008; Vigliocco et al., 2014). For this reason, the language as a system view is not incompatible with the perspective proposed here: rather, the latter includes the former, with language as a structure of categorical components being part of a broader, diversified ensemble that constitutes language use situated in the communicative and physical context.…”