2003
DOI: 10.1109/tkde.2003.1198398

CSVD: clustering and singular value decomposition for approximate similarity search in high-dimensional spaces

Abstract: …cation. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g.

Cited by 66 publications
(65 citation statements)
References 25 publications
“…The problem of dimensionality reduction has recently received broad attention in areas such as machine learning, computer vision, and information retrieval (Berry, Dumais, & O'Brien, 1995; Castelli, Thomasian, & Li, 2003; Deerwester et al., 1990; Dhillon & Modha, 2001; Kleinberg & Tomkins, 1999; Srebro & Jaakkola, 2003). The goal of dimensionality reduction is to obtain more compact representations of the data with limited loss of information.…”
Section: Introduction
confidence: 99%
“…The representation of data by vectors in Euclidean space allows one to compute the similarity between data points, based on the Euclidean distance or some other similarity metric. The similarity metrics on data points naturally lead to similarity-based indexing by representing queries as vectors and searching for their nearest neighbors (Aggarwal, 2001; Castelli, Thomasian, & Li, 2003).…”
Section: Introduction
confidence: 99%
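The similarity-based indexing described in the quote above can be illustrated with a minimal brute-force sketch (the data and query vectors here are hypothetical, not from the cited papers): represent the query as a vector and return the nearest data point under Euclidean distance.

```python
import numpy as np

# Minimal illustration of similarity-based search: the query is a vector,
# and the answer is its nearest neighbor under Euclidean distance.
# Data points and query values are made up for this example.
data = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
query = np.array([0.9, 1.2])

dists = np.linalg.norm(data - query, axis=1)  # Euclidean distance to each point
nearest = int(np.argmin(dists))               # index of the closest data point
assert nearest == 1                           # [1.0, 1.0] is closest to the query
```

Real systems replace the exhaustive distance scan with an index structure; approximate methods such as CSVD trade a small loss in accuracy for much faster search in high dimensions.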
“…Ω, where p is a small oversampling parameter (typically set to 5–10). Multiplying A with the random matrix Ω, we obtain Y = AΩ.…”
Section: Preliminaries
confidence: 99%
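The randomized sampling step quoted above can be sketched as follows (an illustrative reconstruction under standard assumptions, not code from the citing paper): draw a Gaussian test matrix Ω with k + p columns, where k is the target rank and p is the small oversampling parameter, then form Y = AΩ.

```python
import numpy as np

# Hedged sketch of randomized range sampling: all dimensions below are
# hypothetical choices for illustration.
rng = np.random.default_rng(0)

m, n = 100, 50   # data matrix dimensions
k, p = 5, 7      # target rank k and oversampling parameter p (typically 5-10)

A = rng.standard_normal((m, n))
Omega = rng.standard_normal((n, k + p))  # Gaussian test matrix Omega
Y = A @ Omega                            # Y = A*Omega samples the range of A

# An orthonormal basis Q of Y approximates the dominant left subspace of A;
# projecting A onto it gives a small matrix B whose SVD yields an
# approximate rank-k SVD of A.
Q, _ = np.linalg.qr(Y)
B = Q.T @ A
```

The oversampling columns make it overwhelmingly likely that the range of Y captures the top-k left singular subspace of A.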
“…From each cluster a concept vector is derived which is later used as a basis to obtain a low rank approximation of A. In [9,41], similarly to [14], clustering is used to partition either the rows or columns of an m × n matrix A. An approximation of the data matrix A is obtained using rank-1 or rank-kᵢ truncated SVD approximations of each row or column cluster.…”
Section: Principal Angles (Assume We Have a Truncated SVD Approximation)
confidence: 99%
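The cluster-then-SVD idea in the quote above can be sketched in a few lines (a hedged illustration, not the cited papers' implementation): partition the rows of an m × n matrix A into clusters, then approximate each row cluster with its own rank-k truncated SVD. The "clustering" below is a trivial split by row index, standing in for a real k-means step.

```python
import numpy as np

# Hypothetical data; a real use would cluster rows by similarity first.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 12))
k = 3

row_clusters = np.array_split(np.arange(A.shape[0]), 2)  # two row clusters
approx = np.zeros_like(A)
for rows in row_clusters:
    # Best rank-k approximation of this cluster's rows via truncated SVD.
    U, s, Vt = np.linalg.svd(A[rows], full_matrices=False)
    approx[rows] = (U[:, :k] * s[:k]) @ Vt[:k]

# Sanity check: per-cluster rank-k approximation is never worse (in
# Frobenius norm) than a single global rank-k SVD, since restricting the
# global approximation to a cluster also has rank at most k.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
global_k = (U[:, :k] * s[:k]) @ Vt[:k]
assert np.linalg.norm(A - approx) <= np.linalg.norm(A - global_k) + 1e-9
```

This is the core trade-off CSVD exploits: per-cluster subspaces fit locally correlated data better than one global subspace of the same rank, at the cost of storing one basis per cluster.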