Proceedings of the 29th ACM International Conference on Multimedia 2021
DOI: 10.1145/3474085.3475234

Multimodal Dialog System: Relational Graph-based Context-aware Question Understanding

Abstract: Multimodal dialog systems have attracted increasing attention from both academia and industry in recent years. Although existing methods have made some progress, they still face challenges in question understanding (i.e., user intention comprehension). In this paper, we present a relational graph-based context-aware question understanding scheme, which enhances user intention comprehension from local to global. Specifically, we first utilize multiple attribute matrices as t…
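The truncated abstract names a relational graph over the dialog context but does not spell out its formulation. Purely as an illustrative sketch under that assumption (the layer name, n_relations, and the per-relation adjacency matrices are hypothetical, not taken from the paper), one common way to encode a graph that carries one adjacency matrix per attribute/relation is an R-GCN-style layer:

```python
# Illustrative sketch only, NOT the paper's actual model: a relational graph
# layer that aggregates neighbors separately per relation/attribute adjacency
# matrix and adds a self-loop transform. Shapes and names are assumptions.
import torch
import torch.nn as nn

class RelationalGraphLayer(nn.Module):
    def __init__(self, d_model: int, n_relations: int):
        super().__init__()
        # One linear transform per relation type, plus a self-loop transform.
        self.rel_proj = nn.ModuleList(nn.Linear(d_model, d_model) for _ in range(n_relations))
        self.self_proj = nn.Linear(d_model, d_model)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h:   node features, shape (n_nodes, d_model)
        # adj: per-relation adjacency matrices, shape (n_relations, n_nodes, n_nodes)
        out = self.self_proj(h)
        for r, proj in enumerate(self.rel_proj):
            # Degree-normalize each relation's adjacency so neighbors are averaged.
            deg = adj[r].sum(dim=-1, keepdim=True).clamp(min=1.0)
            out = out + (adj[r] / deg) @ proj(h)
        return torch.relu(out)
```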

Cited by 23 publications (12 citation statements) · References 39 publications

“…Moreover, they released a large-scale multimodal dialog dataset in the context of online fashion shopping, named MMD, which significantly promotes the research progress on multimodal dialog systems. In particular, several efforts further explore the semantic relation in the multimodal dialog context and incorporate knowledge based on the framework of MHRED [2], [3], [4], [5], [6], [7]. For example, Liao et al. [5] developed a taxonomy-based visual semantic learning module to capture the fine-grained semantics (e.g., the category and attributes of a product) in product images, and introduced a memory network to integrate the knowledge of fashion style tips.…”
Section: Task-oriented Dialog Systems
Citation type: mentioning; confidence: 99%
“…where $Z^{\mathrm{enc}}_{l} \in \mathbb{R}^{M \times D}$ refers to the output of the $l$-th encoder layer, and $Z^{\mathrm{enc}}_{0}$ is obtained by the aforementioned position-wise embedding layer in Eqn. (2). $Z^{S}_{l} \in \mathbb{R}^{M \times D}$ is the intermediate output of MSA in the $l$-th encoder layer.…”
Section: Preliminary
Citation type: mentioning; confidence: 99%
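The quoted passage describes a standard Transformer encoder layer: $Z^{\mathrm{enc}}_{l-1}$ passes through multi-head self-attention (MSA) to give the intermediate $Z^{S}_{l}$, and a feed-forward sublayer then produces the layer output $Z^{\mathrm{enc}}_{l}$. A minimal PyTorch sketch of that computation, assuming a post-norm encoder layer with residual connections (the cited paper's exact sublayer ordering is not shown, and the names d_model, n_heads, d_ff are illustrative):

```python
# Sketch of one encoder layer under a standard post-norm Transformer assumption:
# Z^S_l  = LayerNorm(Z^enc_{l-1} + MSA(Z^enc_{l-1}))
# Z^enc_l = LayerNorm(Z^S_l + FFN(Z^S_l))
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    def __init__(self, d_model: int = 512, n_heads: int = 8, d_ff: int = 2048):
        super().__init__()
        self.msa = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, z_prev: torch.Tensor) -> torch.Tensor:
        # z_prev: Z^enc_{l-1}, shape (batch, M, D)
        attn_out, _ = self.msa(z_prev, z_prev, z_prev)
        z_s = self.norm1(z_prev + attn_out)       # Z^S_l: intermediate MSA output
        z_enc = self.norm2(z_s + self.ffn(z_s))   # Z^enc_l: layer output
        return z_enc

# Z^enc_0 comes from the position-wise embedding layer (Eqn. (2) in the cited paper).
```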