Abstract: Preference queries are crucial for various applications (e.g., digital libraries) as they allow users to discover and order data of interest in a personalized way. In this paper, we define preferences as preorders over relational attributes and their respective domains. Then, we rely on appropriate linearizations to provide a natural semantics for the block sequence answering a preference query. Moreover, we introduce two novel rewriting algorithms (called LBA and TBA) which exploit the semantics of preference expressions to progressively construct each block of the answer. We demonstrate experimentally the scalability and performance gains of our algorithms (up to 3 orders of magnitude) for variable database and result sizes, as well as for preference expressions of variable size and structure. To the best of our knowledge, LBA and TBA are the first algorithms for efficiently evaluating arbitrary preference queries over voluminous databases.
I. INTRODUCTION

With the Web explosion, an increasing number of users access large data collections without precise knowledge of their content or a clearly identified search goal. Users would rather describe features of data that are potentially useful in some task, or in other words, features that best suit their preferences. Modern database systems should then be able to process queries enhanced with preferences; such queries are called preference queries. The answer to a preference query is a sequence of data blocks, where each block contains data that are more interesting (in terms of the preferences) than the data in the following block. In this way, the user can inspect the blocks one by one and stop at any point at which the data already inspected is satisfactory. In this paper, we are interested in the efficient computation of such block sequences when data collections are modeled as relational tables and preferences as binary relations over the table attributes and their respective domains.
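
To make the notion of a block sequence concrete, the following is a minimal Python sketch of one naive way to obtain it: repeatedly extract the tuples that no remaining tuple is strictly preferred to. It is not LBA or TBA, and it assumes a reading of the block semantics in which each block holds the currently undominated tuples. The function name `block_sequence`, the `strictly_better` callback, and the sample file-format preference are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def block_sequence(
    tuples: Sequence[T],
    strictly_better: Callable[[T, T], bool],
) -> List[List[T]]:
    """Naively partition tuples into blocks of successively less preferred data.

    strictly_better(u, t) must return True when u is strictly preferred to t.
    Assumption: the relation is a strict partial order (no cycles).
    """
    remaining = list(range(len(tuples)))  # work on indices so duplicates are safe
    blocks: List[List[T]] = []
    while remaining:
        # A tuple enters the current block if no remaining tuple dominates it.
        top = [
            i for i in remaining
            if not any(strictly_better(tuples[j], tuples[i]) for j in remaining)
        ]
        if not top:
            raise ValueError("preference relation contains a cycle")
        blocks.append([tuples[i] for i in top])
        chosen = set(top)
        remaining = [i for i in remaining if i not in chosen]
    return blocks


# Hypothetical preference over a "format" attribute: pdf and odt are each
# preferred to doc, while pdf and odt are incomparable with each other.
PREFERRED = {("pdf", "doc"), ("odt", "doc")}


def better(u, t):
    # Compare rows on their second field (the format attribute).
    return (u[1], t[1]) in PREFERRED


rows = [(1, "doc"), (2, "pdf"), (3, "odt"), (4, "doc")]
blocks = block_sequence(rows, better)
for number, block in enumerate(blocks, start=1):
    print(f"block {number}: {block}")
```

Each round of this sketch compares every remaining tuple against every other, so its cost grows quadratically with the table size per block; it is meant only to illustrate the shape of the answer, not an efficient evaluation strategy.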