2003
DOI: 10.1007/978-3-540-45227-0_50

Supporting KDD Applications by the k-Nearest Neighbor Join

Abstract: The similarity join has become an important database primitive to support similarity search and data mining. A similarity join combines two sets of complex objects such that the result contains all pairs of similar objects. Well-known are two types of the similarity join, the distance range join where the user defines a distance threshold for the join, and the closest point query or k-distance join which retrieves the k most similar pairs. In this paper, we propose an important, third similarity join…
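
The join types named in the abstract can be contrasted with a minimal brute-force sketch. It assumes points are tuples of floats compared by Euclidean distance, and the function names (`distance_range_join`, `k_distance_join`, `knn_join`) are hypothetical, chosen here for illustration. The abstract is truncated before it defines the third join, so `knn_join` follows the paper title and the citing statements below (the k nearest neighbors of each point); it is not the paper's algorithm, which the citing statements suggest is index-based.

```python
import heapq
import math
from itertools import product


def distance_range_join(R, S, eps):
    """All pairs (r, s) whose distance is at most the threshold eps."""
    return [(r, s) for r, s in product(R, S) if math.dist(r, s) <= eps]


def k_distance_join(R, S, k):
    """The k closest pairs overall, taken from the full cross product."""
    return heapq.nsmallest(k, product(R, S), key=lambda pair: math.dist(*pair))


def knn_join(R, S, k):
    """For each point r in R, its k nearest neighbors in S (assumed definition)."""
    return {r: heapq.nsmallest(k, S, key=lambda s: math.dist(r, s)) for r in R}


# Illustrative data only, not taken from the paper.
R = [(0.0, 0.0), (10.0, 0.0)]
S = [(1.0, 0.0), (2.0, 0.0), (9.0, 0.0)]
```

The difference that matters is the shape of the result: the range join can return any number of pairs depending on `eps`, the k-distance join returns exactly k pairs in total, and the k-NN join returns k partners for every point of `R`, so no point of `R` is left without neighbors.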

Citations: Cited by 25 publications (24 citation statements)
References: 17 publications
“…A clustering algorithm based on closest pairs has been proposed in [13]. In [2,3] the authors study applications of the k-NN join operation to knowledge discovery, which is a direct extension of the k-Semi-Closest-Pair query. More specifically, the authors discuss the application of k-NN join to clustering, classification and sampling tasks in data mining operations, and they illustrate how these tasks can be performed more efficiently.…”
Section: Given Two Spatial Datasets D (mentioning)
confidence: 99%
“…A related problem, called AkNN, which reports the kNN for each data point, is directly used in the Jarvis-Patrick Clustering algorithm [16]. AkNN is also used in a number of other clustering algorithms including the k-means and the k-medoid clustering algorithms [4].…”
Section: Introduction (mentioning)
confidence: 99%
“…In many applications that use ANN, especially large scientific applications, the datasets are growing rapidly and often the ANN computation is one of the main computational bottlenecks. Recognizing this problem, there has been a lot of interest in the database community in developing efficient external ANN algorithms [4,5,9,13,32]. All of these methods build R*-tree indices [3] on one or both datasets, and evaluate the ANN by traversing the index.…”
Section: Introduction (mentioning)
confidence: 99%
“…A clustering algorithm based on closest pairs has been proposed in [12]. In [2,3] the authors study applications of the k-NN join operation to knowledge discovery, which is a direct extension of the k-semi-closest-pair query. More specifically, the authors discuss the application of k-NN join to clustering, classification and sampling tasks in data mining operations, and they illustrate how these tasks can be performed more efficiently.…”
Section: Introduction (mentioning)
confidence: 99%