Transformer-based pre-trained language models boost the performance of open-domain dialogue systems. Prior works leverage Transformer-based pre-trained language models to generate text with desired attributes in two general approaches: (1) gradient-based methods, which update all latent representations of pre-trained models with gradients from attribute models; and (2) weighted-decoding methods, which re-rank beam candidates from pre-trained models with attribute functions. However, gradient-based methods incur high computation cost and easily overfit on small training sets, while weighted-decoding methods are inherently constrained by the low-variance, high-bias pre-trained model. In this work, we propose a novel approach to control the generation of Transformer-based pre-trained language models: the SideControl framework, which leverages a novel control attributes loss to incorporate useful control signals, and is shown to perform well with very limited training samples. We evaluate our proposed method on two benchmark open-domain dialogue datasets, and results show that the SideControl framework achieves better controllability, higher generation quality, and better sample efficiency than existing gradient-based and weighted-decoding baselines.
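
As a rough illustration of how an auxiliary control attributes loss can be combined with a standard language-modeling objective, consider the minimal PyTorch sketch below. This is not the exact SideControl formulation: the attribute classifier outputs, the weighting term `lam`, and all tensor shapes are illustrative assumptions.

```python
# Minimal sketch (assumed, not the exact SideControl objective): add an
# auxiliary attribute loss to the token-level LM loss so that control
# signals influence training.
import torch
import torch.nn.functional as F

def combined_loss(lm_logits, target_ids, attr_logits, attr_labels, lam=1.0):
    """lm_logits: (batch, seq_len, vocab); target_ids: (batch, seq_len);
    attr_logits: (batch, num_attrs); attr_labels: (batch,)."""
    # Standard token-level cross-entropy for language modeling.
    lm_loss = F.cross_entropy(
        lm_logits.reshape(-1, lm_logits.size(-1)), target_ids.reshape(-1)
    )
    # Auxiliary loss encouraging outputs consistent with the desired attribute.
    attr_loss = F.cross_entropy(attr_logits, attr_labels)
    return lm_loss + lam * attr_loss

# Toy usage with random tensors.
lm_logits = torch.randn(2, 5, 100)
target_ids = torch.randint(0, 100, (2, 5))
attr_logits = torch.randn(2, 3)
attr_labels = torch.randint(0, 3, (2,))
print(combined_loss(lm_logits, target_ids, attr_logits, attr_labels))
```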