Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua 2021
DOI: 10.18653/v1/2021.naacl-main.197

From Masked Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding

Abstract: The lack of publicly available evaluation data for low-resource languages limits progress in Spoken Language Understanding (SLU). As key tasks like intent classification and slot filling require abundant training data, it is desirable to reuse existing data in high-resource languages to develop models for low-resource scenarios. We introduce XSID, a new benchmark for cross-lingual (X) Slot and Intent Detection in 13 languages from 6 language families, including a very low-resource dialect. To tackle the challe…

Citations: cited by 28 publications (41 citation statements)
References: 32 publications
“…Results Our main results (Figure 4) show the baselines against ISO, AOC, and WSE of both datasets. We evaluate with two types of F1, following van der Goot et al [20]: strict and loose-F1. For full model fine-tuning, RoBERTa achieves 91.31 and 98.55 strict and loose F1 on Sayfullina respectively.…”
Section: Analysis Of Results (mentioning)
confidence: 99%
“…Definition F1 As mentioned, we evaluate with two types of F1-scores, following van der Goot et al [20]. The first type is the commonly used span-F1, where only the correct span and label are counted towards true positives.…”
Section: Table (mentioning)
confidence: 99%
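The span-F1 described in the statement above, where a prediction is a true positive only if both its span boundaries and its label match the gold annotation, can be illustrated with a minimal sketch. It assumes BIO-tagged sequences; the function names, the example labels, and the choice to ignore ill-formed `I-` tags are illustrative assumptions, not taken from the cited papers. The loose variant is not shown because the statement does not define it.

```python
from typing import List, Set, Tuple

# (start index, exclusive end index, label); a hypothetical representation
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect labeled spans from one BIO-tagged sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        # Assumption: an "I-" tag that continues no open span is ignored.
    return spans


def strict_span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 where only exact span-and-label matches count."""
    tp = fp = fn = 0
    for gold_tags, pred_tags in zip(gold, pred):
        gold_spans = bio_to_spans(gold_tags)
        pred_spans = bio_to_spans(pred_tags)
        tp += len(gold_spans & pred_spans)
        fp += len(pred_spans - gold_spans)
        fn += len(gold_spans - pred_spans)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the predicted "date" span is too short, so it
# counts as both a false positive and a false negative.
gold = [["B-date", "I-date", "O", "B-loc"]]
pred = [["B-date", "O", "O", "B-loc"]]
score = strict_span_f1(gold, pred)
```

In this example one of two spans matches exactly, so precision and recall are both 0.5.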
“…Most existing datasets, however, either cover multiple domains in a single language (Hakkani-Tür et al, 2016; or the same domain across different languages (Xu et al, 2020). Fortunately, the most recent generation of NLU datasets (Li et al, 2021; van der Goot et al, 2021; Majewska et al, 2022) is both multi-lingual and multi-domain, thus opening up the possibility to assess the true generality of current cross-lingual transfer approaches. Table 3: Multilingual DST datasets.…”
Section: Natural Language Understanding (NLU) (mentioning)
confidence: 99%
“…Secondly, since direct translation still dominates multilingual ToD data collection, there have been several approaches to lower human effort in the translation procedure. In most cases translators would simultaneously annotate the datasets with slot labels and/or dialogue states, depending on the tasks the dataset covers Xu et al, 2020; van der Goot et al, 2021). One approach simplifies the translation process itself, which typically proceeds in two stages: (i) machine translation into the target language; (ii) manual post-editing by native speakers of the language (Zuo et al, 2021; Hung et al, 2022).…”
Section: Outlook For Multilingual ToD Datasets (mentioning)
confidence: 99%
“…Preliminary findings showed that, among the Other cases, about 56 of the completions provided by BERT are unacceptable and 34 of them are dubious acceptable i.e. not clearly recognizable as acceptable 6 , as in the case of the following sentence 7 : Secondo gli esperti, in Italia i giovani leggono meno i giornali rispetto ai giovani di altri Paesi europei, ... rispetto agli anni passati i giovani tra i 14 e i 19 anni leggono più spesso i giornali. [perché anche però].…”
Section: Testing the Sensitivity Of Neural Language Models To Connect... (unclassified)