Interactionally Embedded Gestalt Principles of Multimodal Human Communication

Trujillo, James P.; Holler, Judith

doi:10.1177/17456916221141422

Cited by 21 publications

(12 citation statements)

References 156 publications

“…Specifically, the present study provides evidence that the interpretation of the individual signals does not directly predict the interpretation of the combinations of signals, which also supports the notion of non-additive, Gestalt-like interpretation. This non-additive contribution of multimodal signals is directly in line with a recent theoretical framework of face-to-face communication 19 , and thus provides, to the best our knowledge, the first evidence for such Gestalt-like multimodal utterance perception.…”

Section: Discussion (supporting)

confidence: 85%

“…This is because while individual facial signals may combine into more complex messages in a way that they should be considered compositional, the psychological processes underlying their interpretation into meaningful messages may closely follow Gestalt-psychological processes, in particular the notion that “the whole is more than the sum of its parts”—where the original actually translated ‘not the same’ as the sum of its parts 39 . This fits well with recent theoretical framings of human multimodal language as Gestalt-like 15 , 19 , 40 – 43 . It also fits with the recent, more flexible notion of compositionality in the animal literature 28 , 30 .…”

Section: Introduction (supporting)

confidence: 89%

“…hand gestures and body shifts). Also, the larger interactional embedding of an utterance, which includes discourse context, learned idiosyncrasies of the speaker, and the observer’s own affective state all likely shape utterance interpretation 19 . Finally, the present study only assessed how the general population responded to multimodal Gestalts.…”

Section: Discussion (mentioning)

confidence: 99%

Conversational facial signals combine into compositional meanings that change the interpretation of speaker intentions

Trujillo, Holler

2024

Sci Rep

Self Cite

Human language is extremely versatile, combining a limited set of signals in an unlimited number of ways. However, it is unknown whether conversational visual signals feed into the composite utterances with which speakers communicate their intentions. We assessed whether different combinations of visual signals lead to different intent interpretations of the same spoken utterance. Participants viewed a virtual avatar uttering spoken questions while producing single visual signals (i.e., head turn, head tilt, eyebrow raise) or combinations of these signals. After each video, participants classified the communicative intention behind the question. We found that composite utterances combining several visual signals conveyed different meaning compared to utterances accompanied by the single visual signals. However, responses to combinations of signals were more similar to the responses to related, rather than unrelated, individual signals, indicating a consistent influence of the individual visual signals on the whole. This study therefore provides first evidence for compositional, non-additive (i.e., Gestalt-like) perception of multimodal language.

“…For example, eyebrow frowns often accompany a raising voice pitch to signal the intention to ask a question ( Nota et al, 2021 ). Mechanisms of Gestalt perception ( Wagemans et al, 2012 ), social affordance ( Gallagher, 2020 ), and relevance ( Sperber and Wilson, 1995 ) may jointly contribute to the recognition of multimodal communicative gestalts ( Trujillo and Holler, 2023 ). Finally, the recognition of a specific social action may trigger top-down multilevel predictions about how the message will unfold in time.…”

Section: Multimodal Processing In Face-to-face Interactions: a Possib... (mentioning)

confidence: 99%

“…Specifically, this supports a parallel processing framework whereby the beginning of the message simultaneously activates multiple potential interpretations (i.e., multimodal gestalts). As the message unfolds, concurrent bottom-up sensory processing and multilevel predictions iteratively refine each other toward a final gestalt solution ( Trujillo and Holler, 2023 ). Such a parallel account accommodates evidence that processing of communicative social actions starts early ( Redcay and Carlson, 2015 ), perhaps in parallel to semantic comprehension ( Tomasello et al, 2022 ).…”

Section: Multimodal Processing In Face-to-face Interactions: a Possib... (mentioning)

confidence: 99%

Multimodal processing in face-to-face interactions: A bridging link between psycholinguistics and sensory neuroscience

Benetti, Ferrari, Pavani

2023

Front. Hum. Neurosci.

In face-to-face communication, humans are faced with multiple layers of discontinuous multimodal signals, such as head, face, hand gestures, speech and non-speech sounds, which need to be interpreted as coherent and unified communicative actions. This implies a fundamental computational challenge: optimally binding only signals belonging to the same communicative action while segregating signals that are not connected by the communicative content. How do we achieve such an extraordinary feat, reliably, and efficiently? To address this question, we need to further move the study of human communication beyond speech-centred perspectives and promote a multimodal approach combined with interdisciplinary cooperation. Accordingly, we seek to reconcile two explanatory frameworks recently proposed in psycholinguistics and sensory neuroscience into a neurocognitive model of multimodal face-to-face communication. First, we introduce a psycholinguistic framework that characterises face-to-face communication at three parallel processing levels: multiplex signals, multimodal gestalts and multilevel predictions. Second, we consider the recent proposal of a lateral neural visual pathway specifically dedicated to the dynamic aspects of social perception and reconceive it from a multimodal perspective (“lateral processing pathway”). Third, we reconcile the two frameworks into a neurocognitive model that proposes how multiplex signals, multimodal gestalts, and multilevel predictions may be implemented along the lateral processing pathway. Finally, we advocate a multimodal and multidisciplinary research approach, combining state-of-the-art imaging techniques, computational modelling and artificial intelligence for future empirical testing of our model.

A Roadmap for Technological Innovation in Multimodal Communication Research

Gregori, Amici, Brilmayer et al.

2023

Lecture Notes in Computer Science

Multimodal communication research focuses on how different means of signalling coordinate to communicate effectively. This line of research is traditionally influenced by fields such as cognitive and neuroscience, human-computer interaction, and linguistics. With new technologies becoming available in fields such as natural language processing and computer vision, the field can increasingly avail itself of new ways of analyzing and understanding multimodal communication. As a result, there is a general hope that multimodal research may be at the "precipice of greatness" due to technological advances in computer science and…

Supported by the DFG priority program Visual Communication (ViCom).
