Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.35

GOLD: Improving Out-of-Scope Detection in Dialogues using Data Augmentation

Abstract: Practical dialogue systems require robust methods of detecting out-of-scope (OOS) utterances to avoid conversational breakdowns and related failure modes. Directly training a model with labeled OOS examples yields reasonable performance, but obtaining such data is a resource-intensive process. To tackle this limited-data problem, previous methods focus on better modeling the distribution of in-scope (INS) examples. We introduce GOLD as an orthogonal technique that augments existing data to train better OOS dete…

Cited by 10 publications (7 citation statements); references 44 publications.

“…As noted by Chen and Yu (2021), PersonaChat is particularly suitable for OOS detection of STAR data because it is a rich source of OOS dialogues. For FLOW, "Int."…”
Section: Results on OOS Detection (mentioning, confidence: 99%)
“…with Data Augmentation GOLD (Chen and Yu, 2021) is the data augmentation method most closely related to this work. Given a small set of annotated OOS dialogues (1% of the size of INS), GOLD replaces utterances with sentences selected from an external pool to generate new OOS dialogues.…”
Section: GOLD: Generating Out-of-Scope Labels (mentioning, confidence: 99%)
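The statement above describes the augmentation step itself; a minimal sketch of that step is given below, assuming dialogues are plain lists of utterance strings. The function and variable names (make_oos_candidates, seed_oos_dialogues, external_pool) are illustrative rather than taken from the paper's released code.

    import random

    def make_oos_candidates(seed_oos_dialogues, external_pool, per_dialogue=5, seed=0):
        """Generate pseudo-labeled OOS dialogues by swapping one utterance
        for a sentence drawn from an external (auxiliary) sentence pool."""
        rng = random.Random(seed)
        candidates = []
        for dialogue in seed_oos_dialogues:          # dialogue: list of utterance strings
            for _ in range(per_dialogue):
                position = rng.randrange(len(dialogue))             # utterance to replace
                new_dialogue = list(dialogue)
                new_dialogue[position] = rng.choice(external_pool)  # auxiliary-pool sentence
                candidates.append(new_dialogue)
        return candidates

    # Toy usage: one seed OOS dialogue and a tiny external pool.
    seed_oos = [["How do I reset my password?", "Never mind, what's the weather today?"]]
    pool = ["I love hiking with my dog.", "My favorite band is playing tonight."]
    pseudo_oos = make_oos_candidates(seed_oos, pool, per_dialogue=2)

In the full method these candidates are not all kept; the paper additionally filters them so that only the most beneficial ones are added to the OOS training set, a step this sketch omits.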
“…The same paper shows that we can set OOD examples as n + 1 class and train the classification model with other IND classes. The aforementioned approaches can be used on artificially created OOD instances from IND training examples [17] or enlarge known OOD training data [3] with the help of a pretrained language model.…”
Section: Related Work (mentioning, confidence: 99%)
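The (n + 1)-class setup mentioned in this statement can be made concrete with a small sketch: OOD examples are folded in as one extra label alongside the n in-domain (IND) intent classes, so a single classifier both predicts intents and flags OOD inputs. The toy data and the TF-IDF plus logistic-regression pipeline below are illustrative assumptions, not the cited papers' actual models.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # n in-domain (IND) intent classes ...
    ind_texts = ["book a table for two", "play some jazz music", "set an alarm for 7 am"]
    ind_labels = ["restaurant", "music", "alarm"]

    # ... plus OOD examples collapsed into a single extra, (n+1)-th class.
    ood_texts = ["tell me a joke about penguins"]
    ood_labels = ["OOD"] * len(ood_texts)

    clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    clf.fit(ind_texts + ood_texts, ind_labels + ood_labels)

    # At inference time, a prediction of "OOD" flags the utterance as out of scope.
    print(clf.predict(["what's your favorite color?"]))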
“…Latent perturbation maps text to a hidden state before mapping back to natural language text again (Zhao et al, 2018). Auxiliary datasets take advantage of external unlabeled data from a relevant domain to form new pseudo-labeled examples (Chen and Yu, 2021). Text generation uses large pre-trained models to create new examples (Devlin et al, 2018).…”
Section: Introduction (mentioning, confidence: 99%)