Ambiguity detection in multimodal systems

Caschera, Maria Chiara; Ferri, Fernando; Grifoni, Patrizia

doi:10.1145/1385569.1385625

Cited by 13 publications

(6 citation statements)

References 15 publications

“…As defined in [102], each terminal element $E_i$ is identified by a set of meaningful features, as follows: $E_i^{mod}$ corresponds to the modality (e.g., speech, facial expression, gesture) used to create the element $E_i$; $E_i^{repr}$ indicates how the element $E_i$ is represented by the modality; $E_i^{time}$ measures the time interval (based on the start and end time values) over which the element $E_i$ was created; $E_i^{role}$ corresponds to the syntactic role that the element $E_i$ plays in the multimodal sentence, according to the Penn Treebank Tag set [103] (e.g., noun, verb, adjective, adverb, pronoun, preposition, etc.); and $E_i^{concept}$ gives the semantic meaning of the element considering the conceptual structure of the context [104]. Given two elements $E_i$ and $E_j$, where $E_j$ has a close-by relationship with $E_j$ [7], $E_i^{coop}$ is set to the same value as $E_j$ and specifies the type of cooperation [7] between the elements $E_i$ and $E_j$.…”

Section: Representation (mentioning)

confidence: 99%
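The feature set described in the statement above can be pictured as a small record type. The following is only an illustrative sketch, not the cited authors' implementation: the field names, the `is_close_by` rule (intervals that overlap or are separated by at most `max_gap` seconds), and the `"complementary"` cooperation label are all assumptions made for the example, since the statement defers the definitions of "close-by" and of the cooperation types to [7].

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TerminalElement:
    """One terminal element E_i of a multimodal sentence (hypothetical layout)."""
    modality: str               # E_i^mod, e.g. "speech", "gesture"
    representation: str         # E_i^repr, how the modality represents the element
    time: Tuple[float, float]   # E_i^time, (start, end) in seconds
    role: str                   # E_i^role, a Penn Treebank tag such as "NN"
    concept: str                # E_i^concept, semantic meaning in context
    coop: Optional[str] = None  # E_i^coop, cooperation type shared with a close-by element


def is_close_by(a: TerminalElement, b: TerminalElement, max_gap: float = 0.5) -> bool:
    # ASSUMPTION: "close-by" is modelled as temporal proximity. The gap is
    # negative when the two intervals overlap.
    gap = max(a.time[0], b.time[0]) - min(a.time[1], b.time[1])
    return gap <= max_gap


def link_cooperation(e_i: TerminalElement, e_j: TerminalElement, coop_type: str) -> bool:
    # When the two elements are close-by, both receive the same coop value.
    if is_close_by(e_i, e_j):
        e_i.coop = coop_type
        e_j.coop = coop_type
        return True
    return False


speech = TerminalElement("speech", "this", (0.0, 0.4), "DT", "object")
gesture = TerminalElement("gesture", "pointing", (0.3, 0.9), "NN", "lamp")
late = TerminalElement("speech", "red", (5.0, 5.5), "JJ", "colour")

linked = link_cooperation(speech, gesture, "complementary")
not_linked = link_cooperation(speech, late, "complementary")
```

Here `speech` and `gesture` overlap in time and end up with the same `coop` value, while `late` is too far away and keeps `coop = None`.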

Emotion Classification from Speech and Text in Videos Using a Multimodal Approach

Caschera

Grifoni

Ferri

2022

MTI

Emotion classification is a research area in which there has been very intensive literature production concerning natural language processing, multimedia data, semantic knowledge discovery, social network mining, and text and multimedia data mining. This paper addresses the issue of emotion classification and proposes a method for classifying the emotions expressed in multimodal data extracted from videos. The proposed method models multimodal data as a sequence of features extracted from facial expressions, speech, gestures, and text, using a linguistic approach. Each sequence of multimodal data is correctly associated with the emotion by a method that models each emotion using a hidden Markov model. The trained model is evaluated on samples of multimodal sentences associated with seven basic emotions. The experimental results demonstrate a good classification rate for emotions.
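The abstract's idea of modelling each emotion with its own hidden Markov model can be sketched as follows: score an observation sequence under every emotion's model with the forward algorithm and pick the emotion with the highest likelihood. This is a minimal sketch under stated assumptions, not the paper's method in detail. The two emotion labels, the two-symbol observation alphabet, and all probabilities below are hand-set toy values chosen for illustration; nothing here is trained or taken from the paper, which evaluates seven basic emotions.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    obs is a non-empty list of symbol indices; start[s], trans[r][s] and
    emit[s][symbol] are the usual HMM probability tables.
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def classify(obs, models):
    """Return the label whose HMM gives obs the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))


# Toy models (ASSUMPTION): symbol 0 = "smile-like" feature, 1 = "frown-like" feature.
models = {
    "joy": ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.6, 0.4]]),
    "anger": ([0.5, 0.5], [[0.6, 0.4], [0.3, 0.7]], [[0.2, 0.8], [0.1, 0.9]]),
}

mostly_smiles = classify([0, 0, 1, 0], models)
all_frowns = classify([1, 1, 1], models)
```

With these toy tables, a sequence dominated by symbol 0 is assigned to `"joy"` and a sequence of symbol 1 to `"anger"`. In a real system the tables would be estimated from labelled multimodal sentences rather than written by hand.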

“…semantic meaning of the element considering the conceptual structure of the context [104]. Given two elements $E_i$ and $E_j$, where $E_j$ has a close-by relationship with $E_j$ [7], $E_i^{coop}$ is set to the same value as $E_j$ and specifies the type of cooperation [7] between the elements $E_i$ and $E_j$.…”

Section: Representation (mentioning)

confidence: 99%

“…The introduction of a classificatory step before the ambiguity solution allows adopting a systematic and modular approach. We start from the idea that an incorrect (i.e., ambiguous) interpretation implies the identification of the meaningful features to be managed for solving the ambiguity [22]. This paper goes beyond the static classification process proposed in [10] and provides a dynamic approach modeling knowledge about multimodal ambiguities.…”

Section: Problem Statement (mentioning)

confidence: 99%

DAMA: A Dynamic Classification of Multimodal Ambiguities

Grifoni

Caschera

Ferri

2020

IJCIS

Self Cite

Ambiguity represents uncertainty, but it is also a fundamental topic of discussion for anyone interested in the interpretation of languages, and it is functional for communicative purposes in both human-human communication and human-machine interaction. This paper addresses ambiguity issues in human-machine interaction. It identifies the meaningful features of multimodal ambiguities and proposes a dynamic classification method that characterizes them by learning and by progressively adapting to the evolution of the interaction language, either refining the existing classes or identifying new ones. A new class of ambiguities can be added by identifying and validating the meaningful features that characterize it and distinguish it from the existing ones. The experimental results demonstrate an improvement in the classification rate when new ambiguity classes are considered.

“…This paper discusses the classification step proposing a new classification that extends and reformulates the ambiguity classifications presented for Natural Language (NL) [15] and Visual Languages (VLs) [16] and evolves previous work on multimodal ambiguities [17].…”

Section: Figure 1 Steps Of the Multimodal Interaction Management (mentioning)

confidence: 99%