2018
DOI: 10.48550/arxiv.1809.04067
Preprint

Zoom: SSD-based Vector Search for Optimizing Accuracy, Latency and Memory

Minjia Zhang, Yuxiong He

Abstract: With the advancement of machine learning and deep learning, vector search has become instrumental to many information retrieval systems, which find the best matches to user queries based on semantic similarity. These online services require the search architecture to be both effective, with high accuracy, and efficient, with low latency and a small memory footprint, which existing work fails to offer. We develop Zoom, a new vector search solution that collaboratively optimizes accuracy, latency, and memory based…
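For context, a minimal sketch of the exact (brute-force) vector search that approximate systems such as Zoom are designed to replace; the random embeddings, dimensionality, and function name here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def exact_search(corpus: np.ndarray, query: np.ndarray, k: int = 5):
    """Exact nearest-neighbor search by cosine similarity.

    corpus: (n, d) matrix of embedding vectors.
    query:  (d,) query embedding.
    Returns indices of the k best matches, best first.
    """
    # Normalize rows so that a dot product equals cosine similarity.
    corpus_n = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = corpus_n @ query_n           # (n,) similarity scores
    return np.argsort(-scores)[:k]        # top-k, highest similarity first

# Illustrative usage with random embeddings (n=10_000, d=128).
rng = np.random.default_rng(0)
corpus = rng.standard_normal((10_000, 128)).astype(np.float32)
query = rng.standard_normal(128).astype(np.float32)
print(exact_search(corpus, query, k=5))
```

Each query costs O(n·d) time and the full corpus must be scanned, which is what motivates approximate indexes that trade a little accuracy for much lower latency and memory use.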

Cited by 2 publications (2 citation statements)
References 25 publications
“…By analyzing memory-disk ANNS algorithms [1][2][3][4], we find that when ANNS algorithms are combined with an external disk such as an SSD, they cache edge and point data in the in-memory index as much as possible to enable fast responses to search queries. We use the DiskANN [1] method to build an index on the SIFT10M dataset, with parameters R=64, L=50, and T=16 (R denotes the number of neighbors per vertex in the graph structure, L denotes the number of hops per point when building the index, and T denotes the number of threads used to build the index).…”
Section: Motivation (mentioning)
confidence: 99%
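The quoted statement concerns graph-based memory-disk ANNS indexes. Below is a generic sketch of the greedy candidate-list search such graph indexes (DiskANN among them) typically run at query time; it is not DiskANN's actual implementation, and the function, data layout, and stopping rule are assumptions, with L mirroring the candidate-list parameter from the quote and the adjacency lists bounded by R neighbors per vertex.

```python
import heapq
import numpy as np

def greedy_graph_search(vectors, neighbors, query, entry, L=50, k=10):
    """Generic greedy search over a proximity graph (sketch).

    vectors:   (n, d) array of points; in a memory-disk design these
               would be fetched from SSD, with hot nodes cached in RAM.
    neighbors: adjacency lists with at most R neighbors per vertex.
    L:         size of the candidate list kept during the search.
    """
    def dist(i):
        return float(np.linalg.norm(vectors[i] - query))

    visited = {entry}
    frontier = [(dist(entry), entry)]   # min-heap of unexpanded nodes
    best = [(dist(entry), entry)]       # L best nodes seen so far, sorted
    while frontier:
        d, node = heapq.heappop(frontier)
        # Stop when the closest unexpanded node is already worse than
        # the worst of the L best nodes found so far.
        if len(best) >= L and d > best[-1][0]:
            break
        for nb in neighbors[node]:
            if nb not in visited:
                visited.add(nb)
                d_nb = dist(nb)
                heapq.heappush(frontier, (d_nb, nb))
                best.append((d_nb, nb))
        best.sort()
        del best[L:]                    # truncate to the L best
    return [node for _, node in best[:k]]

# Illustrative usage: random points with a random graph of R=4 neighbors.
rng = np.random.default_rng(0)
pts = rng.standard_normal((1000, 16)).astype(np.float32)
adj = [list(rng.choice(1000, size=4, replace=False)) for _ in range(1000)]
q = rng.standard_normal(16).astype(np.float32)
print(greedy_graph_search(pts, adj, q, entry=0, L=50, k=10))
```

In a memory-disk design, the adjacency lists and vectors touched during this walk would mostly live on SSD, which is why caching hot edges and points in memory, as the quote describes, shortens the search's critical path.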
“…Nearest Neighbor Search (NNS) is a fundamental building block in various application domains [7,8,35,64,67,76,101,110], such as information retrieval [31,111], pattern recognition [26,54], data mining [41,44], machine learning [21,25], and recommendation systems [66,78]. With the explosive growth of dataset scale and the inevitable curse of dimensionality, accurate NNS cannot meet…”
Section: Introduction (mentioning)
confidence: 99%