2016 IEEE 30th International Conference on Advanced Information Networking and Applications (AINA) 2016
DOI: 10.1109/aina.2016.37

eHSim: An Efficient Hybrid Similarity Search with MapReduce

Abstract: In this paper, we study the problems of scalability and performance for similarity search by proposing eHSim, an efficient hybrid similarity search with MapReduce. More specifically, we introduce clustering schemes that partition objects into different groups by their length. Additionally, we equip our proposed schemes with pruning strategies that quickly discard irrelevant objects before truly computing their similarity. Moreover, we design a hybrid MapReduce architecture that deals with challenges from big d…
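To make the idea of length-based grouping with pruning concrete, here is a minimal single-machine sketch. It is not the eHSim algorithm: the abstract does not specify the similarity measure or the pruning rules, so the sketch assumes Jaccard similarity over token sets and uses the standard length filter (two sets whose sizes differ too much cannot reach the threshold). The names `group_by_length` and `similar_pairs` are hypothetical, and no MapReduce framework is involved.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_by_length(sets):
    """Partition object indices into groups keyed by set size."""
    groups = defaultdict(list)
    for idx, s in enumerate(sets):
        groups[len(s)].append(idx)
    return groups


def similar_pairs(objects, threshold):
    """Return index pairs whose Jaccard similarity is >= threshold.

    Assumes 0 < threshold <= 1. Group pairs are pruned with the length
    filter: Jaccard(a, b) <= min(|a|, |b|) / max(|a|, |b|).
    """
    sets = [set(o) for o in objects]
    groups = group_by_length(sets)
    lengths = sorted(groups)
    results = []
    for i, la in enumerate(lengths):
        for lb in lengths[i:]:
            # la <= lb here, so the best possible similarity is la / lb.
            if la < threshold * lb:
                break  # every later (larger) lb fails the filter too
            if la == lb:
                candidates = combinations(groups[la], 2)
            else:
                candidates = ((x, y) for x in groups[la] for y in groups[lb])
            for x, y in candidates:
                # Only surviving candidates pay for the exact computation.
                if jaccard(sets[x], sets[y]) >= threshold:
                    results.append((min(x, y), max(x, y)))
    return sorted(results)
```

In a distributed setting the length groups would be natural units to assign to workers, but how eHSim actually does this is not stated in the truncated abstract.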

Citations: cited by 4 publications (5 citation statements)
References: 29 publications
“…They focus on the load balancing and how to avoid unnecessary object pairs with their filtering methods, including the range-object filtering, the double-pivot filtering, the pivot filtering, and the plane sweeping techniques, so that they can achieve better query performance. In the meantime, Phan et al proposed an efficient hybrid similarity search with MapReduce [10]. Their basic idea is to first cluster similar objects and second define upper and lower boundaries to shrink the search space before looking at similar pairs.…”
Section: Related Work (mentioning)
confidence: 99%
“…Furthermore, to speed up the process of similarity search, different types of indexing are employed. One of the most well-known indexing supporting similarity search is the inverted index, a popular data structure used in information retrieval systems [2,10]. In our work, we build another version of inverted index known as the sorted inverted index so that we can skip unnecessary computing for those elements not in either document or query objects when searching candidate pairs, which is discussed later on in Sect.…”
Section: Similarity Search (mentioning)
confidence: 99%
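As a rough illustration of the inverted index mentioned in the statement above, the following sketch maps each term to the list of documents containing it and uses those postings to find candidate documents for a query. It is a plain textbook inverted index, not the "sorted inverted index" of the citing work nor the indexing used in eHSim; the function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the ascending list of doc ids that contain it."""
    index = defaultdict(list)
    for doc_id, terms in enumerate(docs):
        for term in set(terms):  # de-duplicate terms within a document
            index[term].append(doc_id)
    return dict(index)


def candidate_overlaps(index, query_terms):
    """Count shared terms per document, touching only the query's postings.

    Documents that share no term with the query never appear, so they are
    skipped without any similarity computation.
    """
    counts = defaultdict(int)
    for term in set(query_terms):
        for doc_id in index.get(term, ()):
            counts[doc_id] += 1
    return dict(counts)
```

The overlap counts can then feed an exact similarity computation for the surviving candidates only.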