Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
DOI: 10.18653/v1/2022.naacl-main.323

Improving Compositional Generalization with Latent Structure and Data Augmentation

Abstract: Generic unstructured neural networks have been shown to struggle on out-of-distribution compositional generalization. Compositional data augmentation via example recombination has transferred some prior knowledge about compositionality to such black-box neural models for several semantic parsing tasks, but this often required task-specific engineering or provided limited gains. We present a more powerful data recombination method using a model called Compositional Structure Learner (CSL). CSL is a generative mo…
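The abstract describes recombining training examples to inject compositional prior knowledge into a black-box sequence model. The following Python toy is a minimal hedged sketch of that idea, not the authors' CSL implementation (which induces a quasi-synchronous context-free grammar from data and samples from it): it swaps aligned source/target fragments between examples using a hand-written fragment list. All names, fragments, and training pairs below are hypothetical.

import random

# Toy training pairs for a semantic-parsing-style task (hypothetical).
TRAIN = [
    ("the cat slept", "sleep(cat)"),
    ("the dog ran", "run(dog)"),
]

# Hypothetical aligned fragments: noun phrases paired with their logical forms.
NP_FRAGMENTS = [("the cat", "cat"), ("the dog", "dog"), ("the bird", "bird")]

def recombine(example, fragments, rng):
    """Swap the noun phrase in one example for another, applying the
    same substitution on the source and target sides in lockstep."""
    src, tgt = example
    old_src, old_tgt = next((s, t) for s, t in fragments if s in src)
    new_src, new_tgt = rng.choice([f for f in fragments if f[0] != old_src])
    return src.replace(old_src, new_src), tgt.replace(old_tgt, new_tgt)

rng = random.Random(0)
augmented = [recombine(ex, NP_FRAGMENTS, rng) for ex in TRAIN]
print(augmented)  # e.g. [('the bird slept', 'sleep(bird)'), ('the cat ran', 'run(cat)')]

Recombined examples like these are then added to the fine-tuning data of a pretrained seq2seq model; the synchronized source/target substitution is what carries the compositional prior.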

Cited by 21 publications (25 citation statements) | References 16 publications
“…On structural generalization in particular, the accuracy of all these models is below 10%, with the exception of Zheng and Lapata (2021), who achieve 39% on PP recursion. By contrast, the compositional model of Liu et al. (2021) and the model of Qiu et al. (2022), which uses compositional data augmentation, achieve accuracies upwards of 98% on the full generalization set.…”
Section: Compositional Generalization in COGS
confidence: 94%
“…This points to a fundamental tension between broad-coverage semantic parsing on natural text and the ability to generalize compositionally from structurally limited synthetic training sets (see also Shaw et al., 2021). To our knowledge, the only parser that does well on both is the CSL-T5 system of Qiu et al. (2022), which fine-tunes T5 using a complex data augmentation (DA) method involving synchronous grammars.…”
Section: Introduction
confidence: 99%
“…To also address natural language variation in non-synthetic tasks, some recent works exploit the structure of the source input and its relation to the target side (Herzig and Berant, 2021; Weißenhorn et al., 2022), employing source-side parsing that can be computationally demanding for long sentences, may have coverage challenges, and is not available in all languages; we instead exploit target-side structure only, for higher efficiency. Some other works leverage source-side structure for data augmentation to overcome distribution divergence (Yang et al., 2022b; Qiu et al., 2022), which can clearly help but is not the focus of this paper. Grammar-based decoding has been shown to help semantic parsing on in-distribution data (Krishnamurthy et al., 2017; Yin and Neubig, 2017).…”
Section: Modeling
confidence: 99%
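For readers unfamiliar with the grammar-based decoding mentioned in the statement above (Krishnamurthy et al., 2017; Yin and Neubig, 2017), here is a minimal illustrative sketch, not any cited system's actual API: at each step the decoder may only emit tokens licensed by a target-side grammar, with a caller-supplied scorer standing in for a neural decoder's next-token log-probabilities. The grammar, token set, and scorer below are hypothetical.

from typing import Dict, List

# Toy CFG over logical-form tokens: S -> PRED '(' ARG ')'.
GRAMMAR: Dict[str, List[List[str]]] = {
    "S": [["PRED", "(", "ARG", ")"]],
    "PRED": [["sleep"], ["run"]],
    "ARG": [["cat"], ["dog"]],
}
TERMINALS = {"sleep", "run", "cat", "dog", "(", ")"}

def valid_next(symbols: List[str]) -> List[str]:
    """First terminals derivable from the given symbol."""
    top = symbols[-1]
    if top in TERMINALS:
        return [top]
    firsts: List[str] = []
    for rhs in GRAMMAR[top]:
        firsts += valid_next([rhs[0]])
    return firsts

def decode(score) -> List[str]:
    """Greedy grammar-constrained decoding: expand nonterminals with the
    rule whose first valid terminal the scorer prefers."""
    stack, out = ["S"], []
    while stack:
        top = stack.pop()
        if top in TERMINALS:
            out.append(top)
            continue
        best = max(GRAMMAR[top], key=lambda rhs: score(out, valid_next([rhs[0]])[0]))
        stack.extend(reversed(best))
    return out

# Dummy scorer preferring 'run' and 'dog'; a real parser would restrict
# the decoder's softmax to the grammar-valid tokens at each step.
print(decode(lambda prefix, tok: {"run": 2.0, "dog": 1.0}.get(tok, 0.0)))
# -> ['run', '(', 'dog', ')']

The constraint guarantees well-formed outputs by construction, which is why it helps on in-distribution data; the statement above contrasts this with structure exploited via data augmentation.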
“…More generally, our approach extends the recent line of work on neural parameterizations of classic grammars (Jiang et al., 2016; Han et al., 2017, 2019; Kim et al., 2019; Jin et al., 2019; Zhu et al., 2020; Yang et al., 2021a,b; Zhao and Titov, 2020, inter alia), although unlike these works we focus on the transduction setting. Data Augmentation: Our work is also related to the line of work on utilizing grammatical or alignment structures to guide flexible neural seq2seq models via data augmentation (Jia and Liang, 2016; Fadaee et al., 2017; Andreas, 2020; Akyürek et al., 2021; Shi et al., 2021; Yang et al., 2022; Qiu et al., 2022) or auxiliary supervision (Cohn et al., 2016; Mi et al., 2016; Liu et al., 2016). In contrast to these works, our data augmentation module has stronger inductive biases for hierarchical structure due to explicit use of latent tree-based alignments.…”
Section: Low-Resource MT with Pretrained Models
confidence: 99%