DEFINITION
Given a query on a database schema and a set of views over the same schema, the problem of query rewriting is to find a way to answer the query using only the answers to the views. Rewriting algorithms aim to find such rewritings efficiently, to cope with views that may have limited query-answering capabilities, and to produce rewritings that are efficient to execute.
HISTORICAL BACKGROUND
Query rewriting is one of the oldest problems in data management. Early studies focused on improving the performance of query evaluation [9], since using materialized views can reduce the execution cost of a query. In 1995, Levy et al. [10] formally studied the problem and developed complexity results. The problem became increasingly important with new applications such as data integration, in which views are widely used to describe the semantics of the data at different sources and queries are posed on the global schema. Many algorithms have been developed, including the bucket algorithm [11] and the inverse-rules algorithm [15,7]. See [8] for an excellent survey.
SCIENTIFIC FUNDAMENTALS
Formally, a query Q1 is contained in a query Q2 if, for every instance of the database, the answer to Q1 is a subset of the answer to Q2. Two queries are equivalent if each is contained in the other. Let T be a database schema, and let V be a set of views on T. The expansion of a query P using the views in V, denoted by P^exp, is obtained from P by replacing each view in P with its definition over the base relations. Given a query Q on T, a query P is called a contained rewriting of Q using V if P uses only the views in V and P^exp is contained in Q. P is called an equivalent rewriting of Q using V if P uses only the views in V and P^exp and Q are equivalent.

Examples: Consider a database with the following three relations about students, courses, and course enrollments:
Student(sid, name, dept);
Course(cid, title, quarter);
Take(sid, cid, grade).
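The definitions of expansion and containment above can be made concrete for conjunctive queries (select-project-join queries with equality only, under set semantics). The following is a minimal Python sketch under several assumptions that are not part of this entry: queries are written in Datalog style as a pair (head, body), terms beginning with an uppercase letter are variables and all other terms are constants, view heads list distinct variables, and the views V1 and V2 and the queries q, q_all, and p are hypothetical illustrations over the schema above. Containment is decided by searching for a containment mapping, the classical test for conjunctive queries; the functions `expand` and `contained_in` are names chosen here for illustration.

```python
from itertools import count

def is_var(term):
    # Assumed convention: uppercase-initial strings are variables.
    return isinstance(term, str) and term[:1].isupper()

def expand(p, views):
    """Replace every view atom in p by the body of its definition."""
    head, body = p
    fresh = count()
    out = []
    for rel, args in body:
        if rel not in views:
            out.append((rel, args))
            continue
        vhead, vbody = views[rel]
        sub = dict(zip(vhead, args))   # view head variable -> argument in p
        n = next(fresh)
        def rename(t):
            if not is_var(t):
                return t
            # Non-head variables of the view get a fresh name per use.
            return sub.get(t, f"{t}_{n}")
        out.extend((r, tuple(rename(t) for t in a)) for r, a in vbody)
    return head, out

def contained_in(q1, q2):
    """True iff q1 is contained in q2 (containment mapping from q2 to q1)."""
    (head1, body1), (head2, body2) = q1, q2
    if len(head1) != len(head2):
        return False

    def unify(src, dst, h):
        # Extend mapping h so that h(src) equals dst term by term.
        h = dict(h)
        for s, d in zip(src, dst):
            if is_var(s):
                if h.setdefault(s, d) != d:
                    return None
            elif s != d:
                return None
        return h

    def search(i, h):
        # Map the i-th atom of q2 onto some atom of q1, with backtracking.
        if i == len(body2):
            return True
        rel, args = body2[i]
        for rel1, args1 in body1:
            if rel1 == rel and len(args1) == len(args):
                h2 = unify(args, args1, h)
                if h2 is not None and search(i + 1, h2):
                    return True
        return False

    h0 = unify(head2, head1, {})
    return h0 is not None and search(0, h0)

# Hypothetical views: CS students' enrollments, and course titles.
views = {
    "V1": (("S", "C", "G"), [("Student", ("S", "N", "'CS'")), ("Take", ("S", "C", "G"))]),
    "V2": (("C", "T"), [("Course", ("C", "T", "Qtr"))]),
}
# Hypothetical query: titles and grades of courses taken by CS students.
q = (("T", "G"), [("Student", ("S", "N", "'CS'")), ("Take", ("S", "C", "G")),
                  ("Course", ("C", "T", "Qtr"))])
# Hypothetical query: titles and grades for all enrollments.
q_all = (("T", "G"), [("Take", ("S", "C", "G")), ("Course", ("C", "T", "Qtr"))])
# Candidate rewriting that uses only the views.
p = (("T", "G"), [("V1", ("S", "C", "G")), ("V2", ("C", "T"))])

p_exp = expand(p, views)
print(contained_in(p_exp, q) and contained_in(q, p_exp))  # True
print(contained_in(p_exp, q_all))                         # True
print(contained_in(q_all, p_exp))                         # False
```

In this sketch, p is an equivalent rewriting of q, because its expansion and q are contained in each other. For q_all it is only a contained rewriting, because the expansion restricts the answers to CS students. The backtracking search takes exponential time in the worst case, which is consistent with conjunctive-query containment being NP-complete.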
Consider the following query on the database:
Query Q1: SELECT C.title, T.grade