MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing

Dou, Longxu; Gao, Yan; Pan, Mingyang; Wang, Dingzirui; Che, Wanxiang; Zhan, Dechen; Lou, Jian-Guang

doi:10.1609/aaai.v37i11.26499

Cited by 5 publications

(2 citation statements)

References 21 publications

“…The efforts of the scientific community to build systems for cross-lingual semantic parsing have led to the development of a series of relevant benchmark datasets (Min et al, 2019; Dou et al, 2023; Zhang et al, 2023). Most recent approaches for cross-lingual semantic parsing have sought to localise parsers to new target languages using backtranslations (Sherborne et al, 2020) or machine translation (Xia and Monti, 2021; Shi et al, 2022).…”

Section: Cross-lingual Semantic Parsing (mentioning)

confidence: 99%

“…In an effort to alleviate the shortcomings of Text-to-SQL solutions, increasingly difficult datasets and benchmarks have been developed (Zhong et al, 2017; Yu et al, 2018; Min et al, 2019; Dou et al, 2023; Zhang et al, 2023). The efforts to explore the generalisability of such Text-to-SQL systems have recently culminated with the introduction of multiple database datasets, which distinguish between training and evaluation databases.…”

Section: Introduction (mentioning)

confidence: 99%

FastRAT: Fast and Efficient Cross-lingual Text-to-SQL Semantic Parsing

Vougiouklis, Papasarantopoulos, Zheng et al. 2023

Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacifi

Recent advances of large pre-trained language models have motivated significant breakthroughs in various Text-to-SQL tasks. However, a number of challenges inhibit the deployment of SQL parsers in commercial applications. In this paper, we focus on two such challenges: decoding speed and multilingual input, and introduce FastRAT, a model that includes (i) a decoder-free framework to quickly generate SQL queries from natural language questions based on SQL Semantic Predictions, (ii) a cross-lingual multi-task pre-training scheme, and (iii) a method, based on distant supervision, to extend a semantic parser to new languages. We apply FastRAT on CSpider and Spider, two challenging zero-shot semantic parsing benchmarks. Our system achieves an average of 10x decoding speedup over a set of competitive baselines based on auto- or semi-autoregressive decoding. In the cross-lingual CSpider dataset, our approach achieves an exact query match accuracy score of 61.3, outperforming the relevant competition. In the monolingual task, it maintains competitive performance by exhibiting < 5% accuracy drop compared to disproportionately slower solutions.
