Correctness of SQL queries is usually tested by executing the queries on one or more datasets. Erroneous queries are often the result of small changes, or mutations, of the correct query. A mutation Q' of a query Q is killed by a dataset D if Q(D) ≠ Q'(D). Earlier work on the XData system showed how to generate datasets that kill all mutations in a class of mutations that included join type and comparison operation mutations. In this paper, we extend the XData data generation techniques to handle a wider variety of SQL queries and a much larger class of mutations. We have also built a system for grading SQL queries using the datasets generated by XData. We present a study of the effectiveness of the datasets generated by the extended XData approach, using a variety of queries including queries submitted by students as part of a database course. We show that the XData datasets outperform predefined datasets as well as manual grading done earlier by teaching assistants, while also avoiding the drudgery of manual correction. Thus, we believe that our techniques will be of great value to database course instructors and TAs, particularly to those of MOOCs. They will also be valuable to database application developers and testers for testing SQL queries.
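The notion of killing a mutation can be illustrated with a small sketch. The schema, the queries, the two hand-written datasets, and the helper names (`run`, `kills`) below are all hypothetical and are not taken from XData; SQLite is used only because it ships with Python. The sketch shows a join type mutation (inner join changed to left outer join) that one dataset fails to kill and another kills.

```python
import sqlite3

def run(conn, sql):
    # Sort rows so that row order does not affect the comparison;
    # key=repr avoids comparing None with strings directly.
    return sorted(conn.execute(sql).fetchall(), key=repr)

def kills(dataset_sql, q, q_mut):
    """Return True if the dataset makes q and q_mut produce different results."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(dataset_sql)
        return run(conn, q) != run(conn, q_mut)
    finally:
        conn.close()

# Hypothetical schema, used only for this illustration.
SCHEMA = """
CREATE TABLE student(id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE takes(id INTEGER, course TEXT);
"""

# D1: every student has a matching takes row.
D1 = SCHEMA + """
INSERT INTO student VALUES (1, 'Ann');
INSERT INTO takes VALUES (1, 'DB');
"""

# D2: student 2 has no matching takes row.
D2 = SCHEMA + """
INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO takes VALUES (1, 'DB');
"""

Q = "SELECT s.name, t.course FROM student s JOIN takes t ON s.id = t.id"
# Join type mutation: inner join replaced by left outer join.
Q_MUT = ("SELECT s.name, t.course FROM student s "
         "LEFT OUTER JOIN takes t ON s.id = t.id")

D1_KILLS = kills(D1, Q, Q_MUT)
D2_KILLS = kills(D2, Q, Q_MUT)
print(D1_KILLS, D2_KILLS)  # False True
```

On D1 both queries return the same single row, so the mutation survives. On D2 the mutated query returns an extra row for the unmatched student, so the results differ and the mutation is killed.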
Grading of student SQL queries is usually done by executing the query on sample datasets (which may be unable to catch many errors) and/or by manually comparing a student query with the correct query (which can be tedious and error prone). In this demonstration, we present the XDa-TA system, which can be used by instructors and TAs for grading SQL query assignments automatically. Given one or more correct queries for an SQL assignment, the tool uses the XData system to automatically generate datasets that are designed specifically to catch common errors. The grading is then done by comparing the results of student queries with those of the correct queries against these generated datasets; instructors can optionally provide additional datasets for testing. The tool can also be used in a learning mode by students, where it can provide immediate feedback with hints explaining possible reasons for erroneous output. This tool could be of great value to instructors, particularly instructors of MOOCs.
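The grading step described above, comparing the results of a student query with those of a correct query on each dataset, might be sketched as follows. This is only an illustration under stated assumptions: it uses a single correct query, SQLite, and hand-written datasets, and the names `result_of` and `grade` are hypothetical rather than part of XDa-TA. In the actual system the datasets would be generated by XData.

```python
import sqlite3

def result_of(dataset_sql, query):
    """Run query on a fresh in-memory database; return sorted rows, or None on error."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(dataset_sql)
        return sorted(conn.execute(query).fetchall(), key=repr)
    except sqlite3.Error:
        return None  # the query (or dataset script) failed to execute
    finally:
        conn.close()

def grade(correct_query, student_query, datasets):
    """Return the names of datasets on which the student query disagrees."""
    failed = []
    for name, dataset_sql in datasets.items():
        expected = result_of(dataset_sql, correct_query)
        got = result_of(dataset_sql, student_query)
        if got is None or got != expected:
            failed.append(name)
    return failed

# Hypothetical schema and datasets, written by hand for this illustration.
SCHEMA = "CREATE TABLE course(title TEXT, credits INTEGER);\n"
DATASETS = {
    # No row sits on the comparison boundary, so >= and > agree.
    "sample": SCHEMA + "INSERT INTO course VALUES ('DB', 4), ('Art', 2);",
    # One row sits exactly on the boundary, exposing a >= versus > error.
    "boundary": SCHEMA + "INSERT INTO course VALUES ('OS', 3);",
}

CORRECT = "SELECT title FROM course WHERE credits >= 3"
STUDENT = "SELECT title FROM course WHERE credits > 3"  # comparison error

FAILED = grade(CORRECT, STUDENT, DATASETS)
print(FAILED)  # ['boundary']
```

The hand-picked "sample" dataset does not expose the error, while the dataset with a row on the comparison boundary does. This is the kind of targeted dataset the abstract says the tool generates automatically.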
SQL queries are usually tested for correctness by executing them on one or more datasets, to see if they give the desired results on each dataset. Erroneous queries are often the result of small changes, or mutations, of the correct query. Earlier work on the XData system showed how to generate datasets that kill all mutations in a class of mutations that included join type and comparison operation mutations. However, the system could not handle a number of commonly used SQL features. In this paper we extend the XData data generation techniques to handle features such as null values, string constraints, aggregation with constraints on aggregation results, and a class of subqueries, amongst others. We present a study of the effectiveness of our data generation approach for correcting student SQL assignments that were part of a database course. The datasets generated by XData outperform publicly available datasets, as well as manual grading done earlier by teaching assistants.