2018
DOI: 10.1007/s10115-018-1205-y

A comprehensive empirical comparison of hubness reduction in high-dimensional spaces

Abstract: Hubness is an aspect of the curse of dimensionality related to the distance concentration effect. Hubs occur in high-dimensional data spaces as objects that are particularly often among the nearest neighbors of other objects. Conversely, other data objects become antihubs, which are rarely or never nearest neighbors to other objects. Many machine learning algorithms rely on nearest neighbor search and some form of measuring distances, which are both impaired by high hubness. Degraded performance due to hubness…
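
The abstract defines hubs and antihubs by how often a point occurs among other points' nearest neighbors. The following minimal Python sketch makes that notion concrete by counting k-occurrences on synthetic high-dimensional data; the brute-force distance computation and all names are illustrative assumptions, not code from the paper.

import numpy as np

def k_occurrence(X, k=10):
    """Count how often each point appears among the k nearest
    neighbors of the other points (its k-occurrence O_k)."""
    # Squared Euclidean distances via the Gram matrix (no Python loops).
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)          # a point is not its own neighbor
    knn = np.argsort(d2, axis=1)[:, :k]   # k nearest neighbors of each point
    return np.bincount(knn.ravel(), minlength=len(X))

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 100))       # i.i.d. Gaussian, 100 dimensions
o_k = k_occurrence(X)
print("largest k-occurrence (a hub):", o_k.max())
print("points that are never a neighbor (antihubs):", int((o_k == 0).sum()))

In data like this, the maximum k-occurrence typically far exceeds k while some points never appear as neighbors at all, which is exactly the hub/antihub asymmetry the abstract describes.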

Cited by 20 publications (20 citation statements) · References 39 publications
Citing publications span 2020–2024.
“…The term hubness was first described in MIR (Pachet and Aucouturier, 2004), and is now acknowledged as a general machine learning problem and a new aspect of the curse of dimensionality (Radovanović et al., 2010; Schnitzer et al., 2012; Feldbauer and Flexer, 2019). Hubness is also a relevant problem in recommender systems that use collaborative filtering (Knees et al., 2014), a very common recommendation approach (Schedl, 2019).…”
Section: Related Work and New Contributions
mentioning, confidence: 99%
“…Multiple hubness reduction algorithms have been developed to mitigate these effects (Flexer & Schnitzer, 2013; Hara, Suzuki, Kobayashi, Fukumizu, & Radovanović, 2016; Schnitzer, Flexer, Schedl, & Widmer, 2012). We compared these algorithms exhaustively in a recent survey (Feldbauer & Flexer, 2019), and developed approximate hubness reduction methods with linear time and memory complexity (Feldbauer et al., 2018).…”
Section: Discussion
mentioning, confidence: 99%
“…22 This package offers four methods for reducing hubness that produce a hub-corrected k-NN graph: Mutual Proximity (MP), Local Scaling (LS) and its variant LS-NICDM (Non-Iterative Contextual Dissimilarity Measure), and DisSimLocal (DSL). 42,24 Mutual Proximity models the pairwise distances $d_{i,j}$, $j \in \{1,\dots,n\} \setminus \{i\}$, of a set of $n$ points with random variables $X_i$ that describe the distribution of distances between $x_i$ and all other points, then:…”
Section: Hubness Reduction Methods
mentioning, confidence: 99%
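
The snippet is cut off before the formula, but the empiric variant of Mutual Proximity described by Schnitzer et al. (2012) rescales a distance $d_{i,j}$ to the fraction of the remaining $n - 2$ points whose distances to both $x_i$ and $x_j$ exceed it. A minimal sketch of that variant follows; the function name and the O(n^3) loops are illustrative assumptions, not the API of the package cited above.

import numpy as np

def mutual_proximity_empiric(D):
    """Rescale a symmetric distance matrix D with empiric Mutual
    Proximity and return the secondary dissimilarity 1 - MP."""
    n = D.shape[0]
    mp = np.zeros_like(D, dtype=float)
    for i in range(n):
        for j in range(i + 1, n):
            # Fraction of the other points lying farther from BOTH
            # x_i and x_j than d_ij; large when i and j are close
            # relative to their local distance distributions.
            both_farther = np.sum((D[i] > D[i, j]) & (D[j] > D[i, j]))
            mp[i, j] = mp[j, i] = both_farther / (n - 2)
    out = 1.0 - mp
    np.fill_diagonal(out, 0.0)   # keep self-dissimilarity at zero
    return out

Running the k-NN search on the rescaled matrix instead of D is what yields the hub-corrected k-NN graph mentioned in the snippet.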
“…We used three bulk datasets from The Cancer Genome Atlas (TCGA) and ARCHS4 repositories and added zeros to simulate the dropout effect, either in a simple manner, distributing the excess zeros uniformly across the expression matrix, or using Splatter 23 (see Methods). We used previously developed tools to quantify the magnitude of the hubness phenomenon, 24,25 and we also evaluated the asymmetry of the k-NN graph, which we show to be an informative measure of hubness (see Methods). We worked on the PCA-transformed data, changing the number of Principal Components (PCs) to increase or decrease the data dimensionality.…”
Section: RNA-seq Data Is Prone to Hubness
mentioning, confidence: 99%
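
The snippet names two ways to quantify hubness: the skewness of the k-occurrence distribution (the standard measure of Radovanović et al., 2010) and the asymmetry of the directed k-NN graph. Below is a short sketch of both, assuming a precomputed k-NN index array; the function name and representation are illustrative, not taken from the tools cited above.

import numpy as np
from scipy.stats import skew

def hubness_indicators(knn):
    """Compute two hubness indicators from a k-NN index array of
    shape (n, k), where row i lists the k nearest neighbors of i."""
    n = knn.shape[0]
    # Skewness of the k-occurrence distribution: strongly positive
    # values mean a few hubs dominate the neighbor lists.
    o_k = np.bincount(knn.ravel(), minlength=n)
    skewness = skew(o_k)
    # k-NN graph asymmetry: fraction of directed edges i -> j with
    # no reciprocal edge j -> i; it grows as hubness increases.
    adj = np.zeros((n, n), dtype=bool)
    adj[np.arange(n)[:, None], knn] = True
    asymmetry = float(np.mean(~adj.T[adj]))
    return skewness, asymmetry

Retaining more principal components raises the effective dimensionality, which generally drives both indicators up, matching the effect the snippet investigates.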