2020
DOI: 10.1007/s41060-020-00214-4

Unsupervised extra trees: a stochastic approach to compute similarities in heterogeneous data

Abstract: In this paper we present a method to compute similarities on unlabeled data, based on extremely randomized trees. The main idea of our method, Unsupervised Extremely Randomized Trees (UET), is to randomly split the data in an iterative fashion until a stopping criterion is met, and to compute a similarity based on the co-occurrence of samples in the leaves of each generated tree. Using a tree-based approach to compute similarities is interesting, as the inherent … We evaluate our method on synthetic and real-worl…
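The procedure the abstract describes (random splits until a stopping criterion, then similarity from leaf co-occurrence) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `build_leaves` and `uet_similarity`, the stopping rule (a node with fewer than `min_size` samples becomes a leaf), the uniform cut-point, and the restriction to numerical features are all assumptions made for illustration.

```python
import random


def build_leaves(X, indices, min_size, rng):
    """Recursively split `indices` at random and return the resulting leaves.

    Assumed stopping criterion: a node with fewer than `min_size` samples
    becomes a leaf. Only numerical features are handled in this sketch.
    """
    if len(indices) < min_size:
        return [indices]
    n_features = len(X[0])
    # Try features in random order until one is not constant in this node.
    for f in rng.sample(range(n_features), n_features):
        values = [X[i][f] for i in indices]
        lo, hi = min(values), max(values)
        if lo < hi:
            break
    else:
        return [indices]  # every feature is constant here: cannot split
    cut = rng.uniform(lo, hi)  # random cut-point, no labels involved
    left = [i for i in indices if X[i][f] < cut]
    right = [i for i in indices if X[i][f] >= cut]
    if not left or not right:
        return [indices]  # degenerate cut: keep the node as a leaf
    return (build_leaves(X, left, min_size, rng)
            + build_leaves(X, right, min_size, rng))


def uet_similarity(X, n_trees=200, min_size=3, seed=0):
    """Similarity of i and j = fraction of trees where they share a leaf."""
    rng = random.Random(seed)
    n = len(X)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_trees):
        for leaf in build_leaves(X, list(range(n)), min_size, rng):
            for i in leaf:
                for j in leaf:
                    counts[i][j] += 1
    return [[c / n_trees for c in row] for row in counts]


if __name__ == "__main__":
    # Two well-separated toy clusters (made-up data).
    X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
         [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]]
    S = uet_similarity(X)
    print("same cluster:", S[0][1], "different clusters:", S[0][3])
```

Because every sample lands in exactly one leaf per tree, the diagonal of the resulting matrix is 1, and the matrix is symmetric by construction. Samples that are close in feature space tend to survive more random splits together and so receive a higher score.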

Cited by 4 publications (6 citation statements)
References 24 publications
“…The number of trees was set to 4000, the number of features randomly selected at each node was √K and the max depth was reached when the node was pure or when it had less than two units; all other tunable parameters were set to their default values. 6. UET: UET is based on ET with labels that are randomly generated [10]. A single feature without replacement was selected at each node and n tree was set to 10,000.…”
Section: Evaluation Of Tarf In Real‐world Datasets (mentioning)
confidence: 99%
“…In particular, when k = 1, the structure of the tree and the labels in the sample data are independent. Dalleau et al. [10,11] adapted ET to a setting in which labels are unknown, calling their method UET. The source code and installation instructions can be found on Gitlab (https://gitlab.inria.fr/kdalleau/uetcpp).…”
Section: Background: Et and Uet (mentioning)
confidence: 99%
“…Those methods optimize similarity measures by linear or non-linear aggregation functions in a data-independent manner, which cannot depict the complex attribute coupling relationships and heterogeneity among attributes [5]. Accordingly, a more promising but challenging approach is to capture the heterogeneity of attributes [11,26,82] and learn data-aware similarity metrics [47,64] to capture feature couplings [9,50,85].…”
Section: Challenges Of High-dimensional and Heterogeneous Data (mentioning)
confidence: 99%