2013
DOI: 10.1214/13-aos1140

Equivalence of distance-based and RKHS-based statistics in hypothesis testing

Abstract: We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, maximum mean discrepancies (MMD), that is, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. In the case where the energy distance is computed with a semimetric of negative type, a positive definite kernel, termed distance kernel, may be defined such that the MMD corresponds exactly to the energy distance.
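
For readers comparing the two families, the statistics being linked are, in the usual notation (with X, X′ ∼ P and Y, Y′ ∼ Q independent, d a semimetric, and k a kernel):

    E(P, Q) = 2 E d(X, Y) − E d(X, X′) − E d(Y, Y′)          (energy distance with semimetric d)
    MMD²(P, Q) = E k(X, X′) + E k(Y, Y′) − 2 E k(X, Y)       (squared MMD with kernel k)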


Cited by 490 publications (504 citation statements)
References 20 publications
“…Firstly, there is no reason to stick to only the Euclidean norm ‖·‖₂ to measure distances for ED; the test can be extended to other norms, and in fact also other metrics; Reference [40] explains the details for the closely related independence testing problem. Following that, Reference [41] discusses the relationship between distances and kernels (again for independence testing, but the same arguments also hold in the two-sample testing setting). Loosely speaking, for every kernel k, there exists a metric d (and also vice versa), given by d(x, y) := (k(x, x) + k(y, y))/2 − k(x, y), such that MMD with kernel k equals ED with metric d. This is a very strong connection between these two families of tests: the energy distance is a special case of the kernel MMD, corresponding to a particular choice of kernel, and the kernel MMD itself corresponds to an extremely smoothed Wasserstein distance, for a particular choice of distance.…”
Section: From Energy Distance To Kernel Maximum Mean Discrepancy (mentioning)
confidence: 99%
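
This identity is easy to check numerically. The sketch below is illustrative only: the Gaussian kernel, sample sizes, and all names are my own choices, not from the quoted paper. It builds the induced metric d from a kernel k and confirms that the V-statistic estimates of MMD² and ED coincide:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=(100, 2))  # sample from P
    y = rng.normal(0.5, 1.0, size=(120, 2))  # sample from Q

    def k(a, b):
        # any positive definite kernel works; a Gaussian kernel is used here
        return np.exp(-np.sum((a - b) ** 2) / 2.0)

    def d(a, b):
        # metric induced by k, exactly as in the quoted statement
        return (k(a, a) + k(b, b)) / 2.0 - k(a, b)

    def mean_pairwise(f, u, v):
        # average of f over all pairs (a, b) with a in u, b in v
        return np.mean([[f(a, b) for b in v] for a in u])

    mmd2 = mean_pairwise(k, x, x) + mean_pairwise(k, y, y) - 2 * mean_pairwise(k, x, y)
    ed = 2 * mean_pairwise(d, x, y) - mean_pairwise(d, x, x) - mean_pairwise(d, y, y)
    print(np.isclose(mmd2, ed))  # True: the two statistics agree exactly

The diagonal terms k(x, x) cancel identically when both estimates are expanded, which is why the agreement is exact rather than merely asymptotic.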
“…An important generalization of distance correlation is given by Sejdinovic et al. (2013): a generalized distance correlation in which the distance is a more general metric than the Euclidean one.…”
Section: New Axioms (mentioning)
confidence: 99%
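
As a sketch of what such a generalization looks like in practice (the function name and the metric choices below are my own illustration, not code from the cited works), the biased sample distance covariance needs only pairwise distances and double-centering, so any (semi)metric can be plugged in:

    import numpy as np

    def generalized_dcov(x, y, dx, dy):
        # biased sample distance covariance of n paired observations,
        # computed with user-supplied (semi)metrics dx and dy
        n = len(x)
        A = np.array([[dx(x[i], x[j]) for j in range(n)] for i in range(n)])
        B = np.array([[dy(y[i], y[j]) for j in range(n)] for i in range(n)])
        # double-centering: remove row and column means, add back the grand mean
        A = A - A.mean(axis=0) - A.mean(axis=1, keepdims=True) + A.mean()
        B = B - B.mean(axis=0) - B.mean(axis=1, keepdims=True) + B.mean()
        return (A * B).mean()

    # Euclidean metrics recover Székely-style distance covariance;
    # other metrics of negative type give the generalized statistic.
    rng = np.random.default_rng(1)
    x = rng.normal(size=(50, 2))
    y = x[:, :1] + 0.1 * rng.normal(size=(50, 1))  # y depends on x
    euclid = lambda a, b: np.linalg.norm(a - b)
    print(generalized_dcov(x, y, euclid, euclid))  # clearly positive under dependence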
“…K is called the distance kernel [17]. The map ϕ : X → H_K, ϕ(x) = K(·, x), is the canonical feature map.…”
Section: Appendix A: Proof Of Theorem (mentioning)
confidence: 99%
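
For context, the construction behind the name, restated informally from Sejdinovic et al. (2013): starting from a semimetric ρ of negative type and an arbitrary base point z₀, the distance-induced kernel is

    K(z, z′) = [ρ(z, z₀) + ρ(z′, z₀) − ρ(z, z′)] / 2

Different choices of z₀ yield different positive definite kernels, but all of them generate the same semimetric ρ and hence the same MMD.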
“…Energy distance may be interpreted as a special case of MMD with a particular kernel function [17], and thus our contribution in this paper can be regarded as providing a practical choice of kernel function for the MMD-based method. Since the proposed method does not have any tuning parameter, it is extremely simple and computationally highly efficient.…”
Section: Introduction (mentioning)
confidence: 99%
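
A minimal numerical sketch of this interpretation (again my own illustration, not code from the cited paper): taking ρ to be the Euclidean distance and centering the distance-induced kernel at the origin, the classical energy distance comes out as exactly twice the squared MMD under that kernel, matching the normalization E_ρ = 2·MMD² used in the equivalence paper:

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.normal(0.0, 1.0, size=(80, 3))
    y = rng.normal(0.3, 1.0, size=(90, 3))

    def k0(a, b):
        # distance-induced kernel for the Euclidean metric, centered at the origin
        return 0.5 * (np.linalg.norm(a) + np.linalg.norm(b) - np.linalg.norm(a - b))

    def mean_pairwise(f, u, v):
        # average of f over all pairs (a, b) with a in u, b in v
        return np.mean([[f(a, b) for b in v] for a in u])

    euclid = lambda a, b: np.linalg.norm(a - b)
    energy = (2 * mean_pairwise(euclid, x, y)
              - mean_pairwise(euclid, x, x) - mean_pairwise(euclid, y, y))
    mmd2 = (mean_pairwise(k0, x, x) + mean_pairwise(k0, y, y)
            - 2 * mean_pairwise(k0, x, y))
    print(np.isclose(energy, 2 * mmd2))  # True: E = 2 * MMD^2 for this kernel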