“…Many machine learning methods depend on some measure of pairwise similarity, which is usually computed without supervision. These include dimensionality reduction methods [17], [18], [19], [20], [21], [22], [23], spectral clustering [24], and any method involving the kernel trick, such as SVM [25] and kernel PCA [26]. Random forest proximities can be used to extend many of these methods to a supervised setting and have been used for data visualization [27], [28], [29], [30], [31], outlier detection [30], [32], [33], [34], and data imputation [35], [36], [37], [38].…”
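
To make the idea of a random forest proximity concrete, the following is a minimal sketch of the classical definition, in which the proximity of two samples is the fraction of trees where both land in the same terminal node. The excerpt does not say which proximity variant it has in mind, so this choice is an assumption. The function name `rf_proximity` and the toy leaf assignments are hypothetical. In practice the leaf indices might come from a fitted forest, for example via the `apply` method of scikit-learn's `RandomForestClassifier`.

```python
from typing import List, Sequence


def rf_proximity(leaves: Sequence[Sequence[int]]) -> List[List[float]]:
    """Classical random forest proximity (assumed variant, see lead-in).

    leaves[i][t] is the id of the terminal node that sample i reaches
    in tree t. The result is a symmetric n x n matrix whose (i, j) entry
    is the fraction of trees in which samples i and j share a leaf.
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Count the trees in which both samples end in the same leaf.
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            prox[i][j] = prox[j][i] = shared / n_trees
    return prox


# Hypothetical leaf assignments: 4 samples, 3 trees (illustrative only).
toy_leaves = [
    [1, 4, 2],
    [1, 4, 3],
    [2, 5, 3],
    [2, 4, 2],
]
proximity = rf_proximity(toy_leaves)
```

Because the trees are grown using the labels, the resulting matrix reflects supervised structure. It could then stand in for an unsupervised similarity in methods of the kind listed above, such as a precomputed affinity for spectral clustering.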