Kinda El Maarry scite author profile

2013

Skyline queries are a well-established technique for database query personalization and are widely acclaimed for their intuitive query formulation mechanisms. However, when operating on incomplete datasets, skyline queries are severely hampered and often have to resort to highly error-prone heuristics. Unfortunately, incomplete datasets are a frequent phenomenon, especially when datasets are generated automatically using various information extraction or information integration approaches. Here, the recent trend of crowd-enabled databases promises a powerful solution: during query execution, some database operators can be dynamically outsourced to human workers in exchange for monetary compensation, thereby enabling the elicitation of missing values during runtime. Unfortunately, this powerful feature heavily impacts query response times and (monetary) execution costs. In this paper, we present an innovative hybrid approach combining dynamic crowdsourcing with heuristic techniques in order to overcome current limitations. We will show that by assessing the individual risk a tuple poses with respect to the overall result quality, crowdsourcing efforts for eliciting missing values can be narrowly focused on only those tuples that may degrade the expected quality most strongly. This leads to an algorithm for computing skyline sets on incomplete data with maximum result quality, while optimizing crowdsourcing costs.

Skyline Queries over Incomplete Data - Error Models for Focused Crowd-Sourcing

2013

Crowdsourcing for Query Processing on Web Data: A Case Study on the Skyline Operator

2015

CIT

In recent years, crowdsourcing has become a powerful tool to bring human intelligence into information processing. This is especially important for Web data, which, in contrast to well-maintained databases, is almost always incomplete and may be distributed over a variety of sources. Crowdsourcing makes it possible to tackle many problems that are not yet attainable using machine-based algorithms alone: in particular, it allows database operators to be performed on incomplete data, as human workers can be used to provide values during runtime. As this can quickly become costly, elaborate optimization is required. In this paper, we showcase how such optimizations can be performed for the popular skyline operator for preference queries. We present some heuristics-based approaches and compare them to crowdsourcing-based approaches using sophisticated optimization techniques, focusing especially on result correctness.

Design Patterns for Hybrid Algorithmic-Crowdsourcing Workflows

Maarry²

2014

Skill Ontology-Based Model for Quality Assurance in Crowdsourcing

Cho

et al. 2014