2018
DOI: 10.1145/3231936

Binary Sketches for Secondary Filtering

Abstract: This article addresses the problem of matching the most similar data objects to a given query object. We adopt a generic model of similarity that involves the domain of objects and metric distance functions only. We examine the case of a large dataset in a complex data space, which makes this problem inherently difficult. Many indexing and searching approaches have been proposed, but they have often failed to efficiently prune complex search spaces and access large portions of the dataset when evaluating queri…

Cited by 11 publications (5 citation statements)
References 29 publications
“…Although a number of existing solutions employ some form of indexing, no comparative studies have been conducted to assess the efficiency and scalability of the proposed methods. There is also little work on using approximate retrieval strategies to limit the number of database objects that need to be accessed [121]. In both these aspects, the motion processing, initially cultivated in the computer vision community, could benefit from a closer cooperation with the database, information retrieval, multimedia, and data mining communities.…”
Section: Searching and Filtering (mentioning)
confidence: 99%
“…that expresses the dissimilarity of objects o ∈ D. We consider the data set S ⊆ D, and the so-called kNN queries that search for the k closest objects from S to a query object q ∈ D. Similarity queries are often evaluated in an approximate manner since the slightly imprecise results are sufficient in many real-life applications and they can be delivered significantly faster than the precise ones. Many metric space transformations have been proposed to speed-up the approximate similarity searching, including those producing the Hamming space [4,5,11,18,19], Euclidean space [9,16] and Permutation space [1,6,20]. We further restrict our attention to the metric embedding into the Hamming space.…”
Section: Background and Related Work (mentioning)
confidence: 99%
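
As a minimal illustration of the kNN query over a Hamming space described in the statement above, the sketch below ranks bit-string sketches by their Hamming distance to a query sketch. The sketches are hand-written integers, and `hamming` and `knn_hamming` are hypothetical names chosen for this example; how sketches are derived from the original metric objects is not shown and is not taken from the cited papers.

```python
import heapq

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two sketches stored as ints."""
    return bin(a ^ b).count("1")

def knn_hamming(query: int, sketches: dict[str, int], k: int) -> list[tuple[int, str]]:
    """Return the k (distance, object id) pairs closest to the query sketch."""
    scored = ((hamming(query, s), oid) for oid, s in sketches.items())
    return heapq.nsmallest(k, scored)

# Toy data set S: object ids mapped to 8-bit sketches (assumed values).
sketches = {
    "a": 0b10110010,
    "b": 0b10110011,
    "c": 0b01001101,
    "d": 0b10010010,
}

result = knn_hamming(0b10110010, sketches, k=2)
print(result)
```

Because the result is ranked in the sketch space only, it approximates the ranking under the original metric distance; ties are broken here by object id.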
“…The broad applicability of the metric space similarity model makes the metric search a challenging task, since the distance function is the only operation that can be exploited to compare two objects. One way to speed-up the metric searching is to transform the space to use a cheaper similarity function or to reduce data object sizes [4,9,14,19]. Recently, Connor et al. proposed the n-Simplex projection that transforms the metric space into a finite-dimensional Euclidean space [8,9].…”
Section: Introduction (mentioning)
confidence: 99%
“…Many existing retrieval techniques [2,18,19,24] focus solely on search quality and do not discuss the efficiency at all, which leads to expensive sequential scan over the whole dataset. The efficiency-oriented works either propose very compact features that allow fast sequential scanning [12,13], or utilize various indexing schemes to organize the motion data (e.g., the binary tree [25], kd tree [9], R* tree [4], inverted file index [14], or tries [8]). To optimize the efficiency-effectiveness trade-off, a two-phase retrieval model is often used, where the candidate objects identified within an efficient search phase are submitted to a re-ranking phase that refines the result using more expensive techniques (e.g., traversal of a graph structure [9] or ranking by the Dynamic Time Warping [14,20]).…”
Section: Introduction (mentioning)
confidence: 99%
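
The two-phase retrieval model mentioned in the last statement can be sketched roughly as follows. This is an illustrative example under stated assumptions, not the method of any cited paper: the cheap filter (difference of sequence means), the toy sequences, and the names `cheap_distance` and `two_phase_search` are all hypothetical; only the re-ranking by Dynamic Time Warping is named in the statement itself.

```python
def dtw(x: list[float], y: list[float]) -> float:
    """Dynamic Time Warping distance with |a - b| as the local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(y)  # row 0 of the DTW cost matrix
    for xi in x:
        cur = [inf]
        for j, yj in enumerate(y, start=1):
            cost = abs(xi - yj)
            cur.append(cost + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]

def cheap_distance(x: list[float], y: list[float]) -> float:
    """Assumed cheap filter: difference of the sequence means."""
    return abs(sum(x) / len(x) - sum(y) / len(y))

def two_phase_search(query, data, k, candidates):
    # Phase 1: the cheap distance keeps only `candidates` objects.
    shortlist = sorted(data, key=lambda oid: cheap_distance(query, data[oid]))[:candidates]
    # Phase 2: re-rank the shortlist with the expensive distance.
    return sorted(shortlist, key=lambda oid: dtw(query, data[oid]))[:k]

# Toy one-dimensional "motion" sequences (assumed values).
data = {
    "walk": [0, 1, 2, 1, 0],
    "walk_slow": [0, 0, 1, 1, 2, 2, 1, 1, 0, 0],
    "jump": [0, 5, 0, 5, 0],
    "flat": [1, 1, 1, 1, 1],
}
query = [0, 1, 2, 1, 0]
result = two_phase_search(query, data, k=2, candidates=3)
print(result)
```

The expensive distance is computed only for the shortlist, so `candidates` controls the efficiency-effectiveness trade-off: a larger shortlist costs more DTW evaluations but is less likely to drop a true nearest neighbour in the first phase.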