MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation

Preprint, 2021 · DOI: 10.48550/arXiv.2112.07194


Cited by 2 publications (8 citation statements) · References 0 publications
“…MAUDE (Sinha et al., 2020) is trained with Noise Contrastive Estimation (NCE) (Gutmann and Hyvärinen, 2010). DEB (Sai et al., 2020) augments training data with manually created relevant responses and adversarial irrelevant responses, and MDD-Eval (Zhang et al., 2021b) adopts a teacher model to augment dialog data across different domains to achieve better performance across domains. GRADE and DynaEval leverage a graph structure to better model the dialog.…”
Section: Automatic Evaluation Metrics for Dialog (mentioning)
confidence: 99%
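The NCE objective mentioned for MAUDE can be made concrete with a minimal PyTorch sketch: a scorer is trained to separate the genuine next response from sampled noise responses. Everything below (`score_fn`, the binary cross-entropy formulation, the toy bilinear scorer) is an illustrative assumption, not MAUDE's actual training code.

```python
import torch
import torch.nn.functional as F

def nce_loss(score_fn, context, true_response, noise_responses):
    """NCE-style training signal for a (context, response) scorer.

    The scorer learns to assign high logits to genuine pairs and low
    logits to k noise responses drawn from elsewhere in the corpus.
    """
    pos_logits = score_fn(context, true_response)             # (batch,)
    neg_logits = torch.stack(
        [score_fn(context, r) for r in noise_responses], dim=1
    )                                                          # (batch, k)
    # Push genuine pairs toward label 1 and noise pairs toward label 0.
    pos_loss = F.binary_cross_entropy_with_logits(
        pos_logits, torch.ones_like(pos_logits))
    neg_loss = F.binary_cross_entropy_with_logits(
        neg_logits, torch.zeros_like(neg_logits))
    return pos_loss + neg_loss

# Toy usage: a bilinear scorer over fixed-size embeddings.
W = torch.randn(16, 16, requires_grad=True)
score = lambda c, r: (c @ W * r).sum(-1)
ctx, resp = torch.randn(4, 16), torch.randn(4, 16)
noise = [torch.randn(4, 16) for _ in range(5)]
nce_loss(score, ctx, resp, noise).backward()
```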
“…They propose mechanisms that aggregate pairwise comparisons, such as estimating the probability that a given system scores better than another system. FlowScore (Li et al., 2021b) and the sentence-embedding approach of Rodríguez-Cantelar et al. (2021) model the dynamic information flow in the dialog history in order to evaluate the quality of a dialog. FBD (Xiang et al., 2021) computes the distribution-wise difference between system-generated conversations and human-written conversations to evaluate performance.…”
Section: Automatic Evaluation Metrics for Dialog (mentioning)
confidence: 99%
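The distribution-wise comparison attributed to FBD can be illustrated with a Fréchet-style distance between Gaussians fitted to the two embedding clouds. A minimal sketch, assuming conversation embeddings (e.g. pooled BERT vectors) are produced upstream; the function name and I/O shapes are illustrative, not FBD's actual implementation.

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(sys_emb, human_emb):
    """Fréchet distance between Gaussians fitted to two embedding sets.

    sys_emb, human_emb: (n_samples, dim) arrays of conversation
    embeddings for system-generated and human-written dialogs.
    """
    mu1, mu2 = sys_emb.mean(axis=0), human_emb.mean(axis=0)
    sigma1 = np.cov(sys_emb, rowvar=False)
    sigma2 = np.cov(human_emb, rowvar=False)
    covmean = sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):  # sqrtm can return tiny imaginary noise
        covmean = covmean.real
    diff = mu1 - mu2
    # ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2})
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

A lower distance means the system's conversations are distributed more like human ones; the metric compares corpora as wholes rather than scoring individual responses.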