Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.742

Exploring Unexplored Generalization Challenges for Cross-Database Semantic Parsing

Abstract: We study the task of cross-database semantic parsing (XSP), where a system that maps natural language utterances to executable SQL queries is evaluated on databases unseen during training. Recently, several datasets, including Spider, were proposed to support development of XSP systems. We propose a challenging evaluation setup for cross-database semantic parsing, focusing on variation across database schemas and in-domain language use. We re-purpose eight semantic parsing datasets that have been well-studied …

Cited by 69 publications
(100 citation statements)
References 49 publications
“…(2) On more realistic evaluation settings, including Spider-Realistic and the Suhr et al (2020) datasets, our method outperforms all baselines. This demonstrates the superiority of our pretraining framework in solving the text-table alignment challenge, and its usefulness in practice.…”
Section: Introduction | Citation type: mentioning
confidence: 96%
“…As pointed out by Suhr et al (2020), existing text-to-SQL benchmarks like Spider (Yu et al, 2018b) render the text-table alignment challenge easier than expected by explicitly mentioning exact column names in the NL utterances. Contrast this to more realistic settings where users may refer to the columns using a variety of expressions.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…More recently, large-scale datasets consisting of hundreds of DBs and the corresponding question-SQL pairs have been released (Zhong et al, 2017; Yu et al, 2019b,a) to encourage the development of semantic parsers that can work well across different DBs (Guo et al, 2019; Bogin et al, 2019b; Wang et al, 2019; Suhr et al, 2020; Choi et al, 2020). The setup is challenging as it requires the model to interpret a question conditioned on a relational DB unseen during training and accurately express the question intent via SQL logic.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…However, it is still difficult for current state-of-the-art models to fill in the skeletons with semantically correct entities, especially when they are required to generalize to unseen DB schemas (Yu et al, 2018;Suhr et al, 2020). To predict the correct entity, the model should have a database (DB) schema grounded understanding of the NL question, which means that the model should be able to jointly learn the semantics in the NL question and the structured knowledge in a given database.…”
Section: Introductionmentioning
confidence: 99%