Range query has a crucial role in large-scale data analysis. Unfortunately, the performance may be severely degraded by data skew. Such problem is often faced in large-scale parallel databases, peer-to-peer (P2P) systems as well as in Cloud computing. State-of-the-art methods designed to handle this problem offer significant improvements over naive implementations. However, performance could be further improved by reducing the cost of global skew knowledge and broadcasting. Ganesan, Bawa and Garcia-Molina proposed a load-balancing algorithm that guarantees a good ratio between the maximum and minimum loads among nodes. However, their algorithm requires global max-min load information to use local load balancing operations. Global load information can be found with O (log n) messages. In order to reduce this cost, we propose OPTIMA, a novel online load balancing approach for range partitioned data. Whenever a partition becomes overloaded, data transfers are performed, in background, from the most loaded nodes to the least loaded ones as in Ganesan et al., work. As a result, the partition boundaries and data sizes change. The key point of our proposal is the imperfect knowledge of the global load information (partition statistics). We introduce the concept of "Imperfect Partitioning Vector" (IPV), where, both nodes and clients have an approximate information about the load distribution. They can nevertheless locate any data with almost the same efficiency as using exact partition statistics. Furthermore, maintaining load distribution statistics do not require exchanging additional messages or maintaining a data structure as opposed to the cost of efficient solutions from the state-of-art (which require at least O (log n) messages).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.