“…This puts a major imperative on obtaining high-quality crowdsourced human judgments. Previous research employing crowdsourced judgments has focused on metrics including ease of answering, information flow, and coherence (Li et al., 2016; Dziri et al., 2018), naturalness (Asghar et al., 2018), interestingness (Asghar et al., 2017; Santhanam and Shaikh, 2019), fluency or readability (Zhang et al., 2018), and engagement (Venkatesh et al., 2018). While experimental designs primarily use Likert scales, Belz and Kow (2010) argue that discrete scales, such as Likert scales, can be unintuitive, and that some individuals may avoid extreme values in their judgments.…”
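The central-tendency concern raised by Belz and Kow (2010) is easy to see in a toy simulation. Below is a minimal, purely illustrative Python sketch (the function names, the `bias` and `noise` parameters, and the uniform latent-quality model are all hypothetical assumptions, not part of any cited work): annotators are modeled as shrinking their judgments toward the scale midpoint, so a 5-point Likert scale collects far fewer extreme ratings than the underlying quality distribution would warrant, while a continuous slider preserves the spread.

```python
import random

def likert_response(quality, bias=0.3):
    """Map a latent quality score in [0, 1] to a 1-5 Likert rating.

    `bias` shrinks the score toward the midpoint, simulating
    annotators who avoid the endpoints of a discrete scale.
    """
    shrunk = 0.5 + (quality - 0.5) * (1.0 - bias)
    return min(5, int(shrunk * 5) + 1)

def continuous_response(quality, noise=0.05):
    """Rate the same latent score on a continuous [0, 1] slider."""
    return max(0.0, min(1.0, quality + random.uniform(-noise, noise)))

random.seed(0)
qualities = [random.random() for _ in range(10_000)]
ratings = [likert_response(q) for q in qualities]

# A uniform latent distribution should yield ~2,000 ratings per point,
# but the simulated central-tendency bias starves the endpoints (1 and 5).
print({point: ratings.count(point) for point in range(1, 6)})
```

Under these assumptions, the endpoint categories receive only about a third of the ratings they would get from unbiased annotators, which is one way the distortion attributed to discrete scales could surface in collected judgments.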