2022
DOI: 10.48550/arxiv.2202.13887
Preprint

Probing the Robustness of Trained Metrics for Conversational Dialogue Systems

Abstract: This paper introduces an adversarial method to stress-test trained metrics to evaluate conversational dialogue systems. The method leverages Reinforcement Learning to find response strategies that elicit optimal scores from the trained metrics. We apply our method to test recently proposed trained metrics. We find that they all are susceptible to giving high scores to responses generated by relatively simple and obviously flawed strategies that our method converges on. For instance, simply copying parts of the…

Cited by 1 publication (1 citation statement)
References 10 publications

“…For instance, dropping punctuation or removing certain words does not decrease the scores produced by the automated metric. On the other hand, Deriu et al. (2022) showed that when an automated metric is used as a reward for reinforcement learning, the policy converges to suboptimal solutions, which are rated highly by the metric. Thus, a key future direction is to develop automatic metrics which are built to be robust against these kinds of attacks.…”
Section: Robustness Against Gaming the Metric
Citation type: mentioning
confidence: 99%