Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.419
Controllable Meaning Representation to Text Generation: Linearization and Data Augmentation Strategies

Abstract: We study the degree to which neural sequence-to-sequence models exhibit fine-grained controllability when performing natural language generation from a meaning representation. Using two task-oriented dialogue generation benchmarks, we systematically compare the effect of four input linearization strategies on controllability and faithfulness. Additionally, we evaluate how a phrase-based data augmentation method can improve performance. We find that properly aligning input sequences during training leads to high…

Cited by 15 publications (19 citation statements)
References 29 publications
“…Finally, Kedzie and McKeown (2020), appearing contemporaneously to our work, seek to control the output generation by manipulating the input linearization order, using a randomization similar to ours as an "uncontrolled" baseline. Given their focus on task-oriented dialogue planning, which uses simpler meaning representations and sentences than the AMR dataset used here (i.e., shallower graphs and limited domains), we view their work as complementary to our own.…”
Section: Ours
confidence: 99%
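To make the linearization idea concrete, here is a minimal illustrative sketch (not code from either paper) of fixed versus randomized linearization of a flat slot-value meaning representation; the MR format and the E2E-style slot names are assumptions for illustration:

```python
import random

def linearize(mr, order="fixed", seed=None):
    # Turn a flat slot-value meaning representation (MR) into a token
    # sequence for a seq2seq encoder.
    #   order="fixed":  canonical (alphabetical) slot order
    #   order="random": shuffled slots, i.e. an "uncontrolled" baseline
    slots = list(mr.items())
    if order == "fixed":
        slots.sort(key=lambda kv: kv[0])
    elif order == "random":
        random.Random(seed).shuffle(slots)
    return " ".join(f"{slot}[{value}]" for slot, value in slots)

# E2E-style example MR (slot names are illustrative assumptions).
mr = {"name": "Aromi", "eat_type": "coffee shop", "area": "city centre"}
print(linearize(mr, order="fixed"))
print(linearize(mr, order="random", seed=0))
```

Under a controlled linearization, the training-time slot order can be aligned with the order in which slots are realized in the reference utterance, which is the alignment property the abstract credits for controllability.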
“…But it is non-trivial to integrate planning modules into them. Existing approaches resort to decoupling planning and decoding stages (Hua and Wang, 2020; Kedzie and McKeown, 2020), which inevitably increases system complexity and potentially introduces cascading errors.…”
Section: Related Work
confidence: 99%
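As a minimal sketch of the decoupled design criticized above (a hypothetical stand-in, not any cited system), a separate planning stage fixes the content order before a surface-realization stage consumes it, so planning mistakes propagate downstream:

```python
def plan(mr):
    # Stage 1: content planner picks a realization order for the slots.
    # A trivial stand-in; in the cited systems this is a learned model.
    return sorted(mr)

def realize(mr, slot_order):
    # Stage 2: surface realizer conditioned on the plan. Any error in
    # the plan (a dropped or misordered slot) cascades into the text.
    return ", ".join(f"{slot} is {mr[slot]}" for slot in slot_order)

mr = {"name": "Aromi", "area": "city centre"}
print(realize(mr, plan(mr)))
```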
“…The most common approach to ensuring semantic quality relies on over-generating and then reranking candidate outputs using criteria that the model was not explicitly optimized for in training. Reranking in sequence-to-sequence models is typically performed by creating an extensive set of rules, or by training a supplemental classifier, that indicates for each input slot whether it is present in the output utterance (Wen et al., 2015a; Dušek and Jurčíček, 2016; Juraska et al., 2018; Agarwal et al., 2018; Kedzie and McKeown, 2020; Harkous et al., 2020). Wen et al. (2015b) proposed an extension of the underlying LSTM cells of their sequence-to-sequence model to explicitly track, at each decoding step, the information mentioned so far.…”
Section: Related Work
confidence: 99%
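The over-generate-and-rerank scheme described above can be sketched in a few lines; this is a deliberately crude stand-in, with exact substring matching in place of the rule sets or supplemental slot classifiers used in the cited work:

```python
import re

def slot_coverage(utterance, mr):
    # Fraction of MR slot values realized verbatim in the utterance.
    # Exact string match is a crude proxy for a trained slot classifier.
    hits = sum(bool(re.search(re.escape(str(v)), utterance, re.IGNORECASE))
               for v in mr.values())
    return hits / max(len(mr), 1)

def rerank(candidates, mr):
    # Over-generate (e.g., via beam search) elsewhere, then pick the
    # candidate mentioning the most slots; break ties on model score.
    return max(candidates,
               key=lambda c: (slot_coverage(c["text"], mr), c["score"]))

mr = {"name": "Aromi", "food": "Italian"}
candidates = [
    {"text": "Aromi serves food.", "score": -1.2},
    {"text": "Aromi serves Italian food.", "score": -1.5},
]
print(rerank(candidates, mr)["text"])  # the semantically faithful one wins
```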
“…Among the most recent efforts, the jointly-learned segmentation and alignment method of Shen et al. (2020) improves semantic accuracy while simultaneously increasing output diversity. Kedzie and McKeown (2020) use segmentation for data augmentation and automatic utterance planning, which leads to a reduction in semantic errors on both the E2E and ViGGO (Juraska et al., 2019) datasets.…”
Section: Related Work
confidence: 99%
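A rough sketch of segmentation-based data augmentation in this spirit (the phrase alignment here is hand-given for illustration; the cited methods learn it): contiguous spans of phrase-aligned fragments are recombined into additional (sub-MR, sub-utterance) training pairs:

```python
# Each fragment realizes exactly one slot; the alignment is assumed given.
aligned = [
    ({"name": "Aromi"},           "Aromi"),
    ({"eat_type": "coffee shop"}, "is a coffee shop"),
    ({"area": "city centre"},     "in the city centre"),
]

def augment(fragments):
    # Pair every contiguous span of fragments with the union of its
    # slot-value pairs, yielding extra, smaller training examples.
    examples = []
    for i in range(len(fragments)):
        for j in range(i + 1, len(fragments) + 1):
            span = fragments[i:j]
            sub_mr = {k: v for frag, _ in span for k, v in frag.items()}
            text = " ".join(phrase for _, phrase in span)
            examples.append((sub_mr, text))
    return examples

for sub_mr, text in augment(aligned):
    print(sub_mr, "->", text)
```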