Second Grand-Challenge and Workshop on Multimodal Language (Challenge-HML) 2020
DOI: 10.18653/v1/2020.challengehml-1.7
Audio-Visual Understanding of Passenger Intents for In-Cabin Conversational Agents

Abstract: Building multimodal dialogue understanding capabilities situated in the in-cabin context is crucial to enhancing passenger comfort in autonomous vehicle (AV) interaction systems. To this end, understanding passenger intents from spoken interactions and vehicle vision systems is an important building block for developing contextual and visually grounded conversational agents for AVs. Towards this goal, we explore AMIE (Automated-vehicle Multimodal In-cabin Experience), the in-cabin agent responsible for handling m…

Cited by 4 publications (5 citation statements)
References 20 publications
“…The most seminal technique of this family is Back-Translation (BT), in which a sentence is translated into a pivot language and then back into English (Hayashi et al., 2018; Yu et al., 2018; Edunov et al., 2018; Corbeil and Ghadivel, 2020; AlAwawdeh and Abandah, 2021). Neural networks that directly generate paraphrases have also been used, with specialized decoding techniques for RNNs (Kumar et al., 2019), or by using a BART model trained on a corpus of paraphrases generated from BT (Okur et al., 2022).…”
Section: Related Work
confidence: 99%
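The Back-Translation idea quoted above can be sketched in a few lines. This is a toy illustration only: the word tables below are invented for the sketch, and real BT pipelines use neural machine-translation models (e.g., English→German→English) rather than lookup tables.

```python
# Toy illustration of Back-Translation (BT) for paraphrase generation.
# Real pipelines use neural MT models for both directions; the word
# tables here are invented solely to make the round trip visible.

EN_TO_DE = {"the": "der", "car": "wagen", "is": "ist", "fast": "schnell"}
# The reverse table maps one word back to a synonym, so the round trip
# yields a paraphrase rather than the identical input sentence.
DE_TO_EN = {"der": "the", "wagen": "automobile", "ist": "is", "schnell": "quick"}

def translate(sentence: str, table: dict) -> str:
    """Word-by-word 'translation' via a lookup table (stand-in for an MT model)."""
    return " ".join(table.get(word, word) for word in sentence.split())

def back_translate(sentence: str) -> str:
    """English -> pivot language -> English: the BT round trip."""
    pivot = translate(sentence, EN_TO_DE)
    return translate(pivot, DE_TO_EN)

print(back_translate("the car is fast"))  # -> "the automobile is quick"
```

The paraphrase quality in practice comes from the MT models: a strong pivot translator introduces lexical and syntactic variation while preserving meaning, which is what makes BT useful for data augmentation.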
“…We use the FSMT model from Hugging Face, with German as the intermediary language, which has been shown to obtain good performance (Edunov et al., 2018). Okur et al. (2022) propose to fine-tune BART on a corpus of in-domain paraphrases created with BT. We found in our experiments that we could get results just as good by using T5-small-Tapaco, the T5 model fine-tuned on the TaPaCo corpus of paraphrases (Scherrer, 2020).…”
Section: Algorithms
confidence: 99%
“…Jointly training Intent Recognition and Entity Extraction models has been explored recently (Zhang and Wang, 2016; Liu and Lane, 2016; Goo et al., 2018; Varghese et al., 2020). Various hierarchical multi-task architectures have also been proposed for these joint NLU tasks (Zhou et al., 2016; Wen et al., 2018; Okur et al., 2019; Vanzo et al., 2019), some even in multimodal contexts (Gu et al., 2017; Okur et al., 2020). Vaswani et al. (2017) proposed the Transformer, a novel network architecture based entirely on attention mechanisms (Bahdanau et al., 2014).…”
Section: Natural Language Understanding
confidence: 99%
“…Paraphrase Generation has been widely studied in Natural Language Understanding tasks such as dialogue systems (Quan and Xiong, 2019; Okur et al., 2022), intent classification (Rentschler et al., 2022) and slot filling (Hou et al., 2021). For Natural Language Generation (NLG), we have found that using multiple references leads to a more robust evaluation (Dušek et al., 2020).…”
Section: AMR Graph
confidence: 99%
“…Paraphrase generation has proven helpful for data augmentation in diverse tasks such as natural language understanding (Okur et al., 2022), question answering, and task-oriented dialog systems (Gao et al., 2020). However, as far as we know, this strategy has yet to be studied for improving the performance of AMR-to- reira et al, 2017; might easily surpass it.…”
Section: Introduction
confidence: 99%