21st International Conference on Advanced Information Networking and Applications (AINA '07) 2007
DOI: 10.1109/aina.2007.80

Improving distributed join efficiency with extended bloom filter operations

Abstract: Bloom filter based algorithms have proven successful as a very efficient technique for reducing the communication costs of database joins in a distributed setting. However, the full potential of Bloom filters has not yet been exploited. Especially in the case of multi-joins, where the data is distributed among several sites, additional optimization opportunities arise, which require new Bloom filter operations and computations. In this paper, we present these extensions and point out how they improve the performance of…

Cited by 32 publications (23 citation statements)
References 21 publications
“…These bit arrays are similar to those employed in traditional Bloom filters and are supported by a sufficiently large body of research work [14], [16], [17] that allows us to estimate the number of documents reachable for a multi-concept query solely based on these bit arrays. Similar to level 1, level 2 (TSBF 2,P) also contains multiple bit arrays, each representing different multi-concept queries whose concepts have C as the least common ancestor in the ontology hierarchy and for which P has at least one qualified document in its local document collection (TSBF 2,P (C)).…”
Section: Two-level Semantic Bloom Filter (TSBF) | mentioning
confidence: 99%
“…The research community has proposed many works to estimate the cardinality (i.e., number of elements) of an original set solely based on its Bloom filter bit array [14], [16], [17]. For our work we used the work presented by the authors of [16].…”
Section: Estimating Set Intersection Based Cardinality From Bloom | mentioning
confidence: 99%
“…8), we can estimate the number of true bits in the Bloom filter of the intersection of the two sets A ∩ B. From there, we can use a theorem from [42] to estimate the number of elements in the intersection.…”
Section: A Proofs | mentioning
confidence: 99%
“…Thus we can use Lemma 4.1 from [42] to estimate the number of objects hashed into it: $|\mathit{SET}_{BF_\cap}| = m \left(1 - (1 - 1/m)^{kn}\right)$, where $n = E(|A \cap B|)$. Combining that with equation 10, we get an estimation for $E(|A \cap B|)$:…”
Section: A Proofs | mentioning
confidence: 99%
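
The relation quoted above, between the number of set bits and the number of inserted elements, can be inverted to estimate a set's size from its Bloom filter alone. The following is a minimal sketch under the usual assumptions (m bits, k independent uniform hash functions); the function names are illustrative and are not taken from the cited papers.

```python
import math


def expected_set_bits(m: int, k: int, n: float) -> float:
    """Expected number of 1-bits after inserting n elements
    into an m-bit Bloom filter with k hash functions."""
    return m * (1.0 - (1.0 - 1.0 / m) ** (k * n))


def estimate_cardinality(m: int, k: int, set_bits: float) -> float:
    """Invert expected_set_bits: estimate n from the observed number of 1-bits.

    Solving t = m * (1 - (1 - 1/m)^(k*n)) for n gives
    n = ln(1 - t/m) / (k * ln(1 - 1/m)).
    """
    if not 0 <= set_bits < m:
        # A completely full filter carries no size information.
        raise ValueError("set_bits must satisfy 0 <= set_bits < m")
    return math.log(1.0 - set_bits / m) / (k * math.log(1.0 - 1.0 / m))
```

For an intersection estimate, `set_bits` would be the estimated number of 1-bits attributable to A ∩ B, which the citing paper derives separately; that step is not reproduced here.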
“…The bloomjoin algorithm is combined with a group-by operation and extended to multi-way joins in [10]. More recent studies [14,18] optimize complex distributed multiway joins using this algorithm. However, while all of these works assume the join relations are not partitioned, in a MapReduce environment a dataset is split and distributed across many nodes.…”
Section: Bloom Filter | mentioning
confidence: 99%
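
As an illustration of the basic two-site bloomjoin idea mentioned above, here is a minimal single-process sketch. It assumes relations are lists of (key, value) pairs and simulates the two sites in one function; the `BloomFilter` class, its SHA-256 based hashing, and the default parameters are illustrative choices, not the method of the indexed paper.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: m bits, k hash positions derived from SHA-256."""

    def __init__(self, m: int, k: int):
        self.m, self.k = m, k
        self.bits = bytearray(m)  # one byte per bit, for simplicity

    def _positions(self, key):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def bloomjoin(r, s, m=1024, k=3):
    """Join r and s on their first field, shipping only filtered tuples of s."""
    # Site 1: summarize the join keys of r in a Bloom filter and "send" it.
    bf = BloomFilter(m, k)
    for key, _ in r:
        bf.add(key)
    # Site 2: keep only tuples of s whose key may occur in r (false positives possible).
    candidates = [(key, val) for key, val in s if key in bf]
    # Site 1: exact join on the reduced relation; false positives drop out here.
    index = {}
    for key, val in r:
        index.setdefault(key, []).append(val)
    return [(key, rv, sv) for key, sv in candidates for rv in index.get(key, [])]
```

The communication saving comes from sending the m-bit filter and the filtered candidates instead of a full relation; the final exact join keeps the result correct even when the filter admits false positives.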