2021
DOI: 10.48550/arxiv.2112.08321
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

CheckDST: Measuring Real-World Generalization of Dialogue State Tracking Performance

Abstract: Recent neural models that extend the pretrainthen-finetune paradigm continue to achieve new state-of-the-art results on joint goal accuracy (JGA) for dialogue state tracking (DST) benchmarks. However, we call into question their robustness as they show sharp drops in JGA for conversations containing utterances or dialog flows with realistic perturbations. Inspired by CheckList (Ribeiro et al., 2020), we design a collection of metrics called CheckDST that facilitate comparisons of DST models on comprehensive d… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Publication Types

Select...

Relationship

0
0

Authors

Journals

citations
Cited by 0 publications
references
References 21 publications
0
0
0
Order By: Relevance

No citations

Set email alert for when this publication receives citations?