Findings of the Association for Computational Linguistics: EMNLP 2021
DOI: 10.18653/v1/2021.findings-emnlp.174

Natural SQL: Making SQL Easier to Infer from Natural Language Specifications

Abstract: Addressing the mismatch between natural language descriptions and the corresponding SQL queries is a key challenge for text-to-SQL translation. To bridge this gap, we propose an SQL intermediate representation (IR) called Natural SQL (NatSQL). Specifically, NatSQL preserves the core functionalities of SQL, while it simplifies the queries as follows: (1) dispensing with operators and keywords such as GROUP BY, HAVING, FROM, JOIN ON, which are usually hard to find counterparts for in the text descriptions; (2) r…

Cited by 29 publications (18 citation statements)
References 28 publications
“…Table 2 presents the difference between the SQL in Spider and the SQL generated by NatSQL in Spider-SS. Our evaluation results are lower than the original NatSQL dataset (Gan et al, 2021b) because the Spider-SS uses equivalent SQL and corrects some errors, as discussed in Section 2.3. Some equivalent and corrected SQL cannot get positive results in exact match metric and execution match.…”
Section: Dataset Analysis (mentioning)
confidence: 86%
“…The difficulty criteria are defined by Spider benchmark, including easy, medium, hard and extra hard. Experiments show that the more difficult the SQL is, the more difficult it is to predict correctly (Shi et al, 2021; Gan et al, 2021b). It can be found from Table 3 that the difficulty distribution of CG-SUB T and CG-SUB D is similar to that of Spider D.…”
Section: Dataset Analysis (mentioning)
confidence: 90%