2018
DOI: 10.48550/arxiv.1809.10455
Preprint
Statistical dependence: Beyond Pearson's $ρ$

Dag Tjøstheim,
Håkon Otneim,
Bård Støve

Abstract: Pearson's ρ is the most widely used measure of statistical dependence. It gives a complete characterization of dependence in the Gaussian case, and it also works well in some non-Gaussian situations. It is well known, however, that it has a number of shortcomings, in particular for heavy-tailed distributions and in nonlinear situations, where it may produce misleading and even disastrous results. In recent years a number of alternatives have been proposed. In this paper, we will survey these developments, especially…
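The failure mode the abstract alludes to is easy to reproduce: for a symmetric input and a purely quadratic (nonlinear) relationship, Pearson's ρ is close to zero even though one variable fully determines the other. A minimal illustrative sketch (not taken from the paper):

```python
import numpy as np

# Y = X^2 is a perfect functional dependence, yet for X symmetric
# about 0 the covariance Cov(X, X^2) = E[X^3] = 0, so Pearson's rho ~ 0.
rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = x**2

rho = np.corrcoef(x, y)[0, 1]
print(round(rho, 3))  # near 0 despite total dependence
```

This is the standard counterexample to reading "ρ = 0" as "independent": ρ only measures the linear component of the relationship.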

Cited by 5 publications (5 citation statements) | References 110 publications (180 reference statements)
“…While independence between two features implies that the PCC is zero, the converse is generally false. The PCC, which is often used to analyze dependence, only tracks linear correlations and has other shortcomings such as sensitivity to outliers [113]. Any type of dependence between features can have a strong impact on the interpretation of the results of IML methods (see Sect.…”
Section: Confusing Linear Correlation With General Dependence
confidence: 99%
“…Kernel-based measures, such as kernel canonical correlation analysis (KCCA) [6] or the Hilbert-Schmidt independence criterion (HSIC) [44], are commonly used. They have a solid theoretical foundation, are computationally feasible, and robust [113]. In addition, there are information-theoretical measures, such as (conditional) mutual information [24] or the maximal information coefficient (MIC) [93], that can however be difficult to estimate [9,116].…”
Section: Confusing Linear Correlation With General Dependence
confidence: 99%
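The kernel-based measures named in the statement above can be sketched compactly. The following is a minimal NumPy implementation of the biased empirical HSIC estimator with Gaussian kernels (a sketch of the standard estimator, not code from any of the cited papers; the bandwidth choice sigma=1.0 is an arbitrary assumption here):

```python
import numpy as np

def gaussian_gram(z, sigma=1.0):
    # Gaussian (RBF) kernel matrix from pairwise squared distances.
    d2 = (z[:, None] - z[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma**2))

def hsic(x, y, sigma=1.0):
    # Biased empirical HSIC: (1/n^2) * trace(K H L H),
    # where H is the centering matrix. Zero in the population
    # iff X and Y are independent (for characteristic kernels).
    n = len(x)
    K, L = gaussian_gram(x, sigma), gaussian_gram(y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return float(np.trace(K @ H @ L @ H)) / n**2

rng = np.random.default_rng(1)
x = rng.normal(size=300)
h_dep = hsic(x, x**2)                    # nonlinear dependence
h_indep = hsic(x, rng.normal(size=300))  # independent pair
print(h_dep > h_indep)
```

Unlike Pearson's ρ, the statistic stays clearly positive for the quadratic relationship, while for an independent pair it shrinks toward zero as n grows.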
“…Examples for spaces of negative type are L^p spaces for 1 ≤ p ≤ 2, ultrametric spaces, and weighted trees (Meckes 2013, Theorem 3.6). Beyond classical correlation coefficients (such as Pearson's ρ, Spearman's ρ, or Kendall's τ), the statistical literature on how to best measure dependency in applications quickly becomes immensely broad and scattered. Besides the approaches mentioned above, suggestions include maximal correlation coefficients (Gebelein 1941; Koyak 1987), rank- or copula-based methods (Schweizer and Wolff 1981; Dette et al. 2013; Marti et al. 2017), or various measures acting on the distribution of pairwise distances (Friedman and Rafsky 1983; Heller et al. 2013), to only cite a few (for a more complete survey of established methods, see Tjøstheim et al. 2018). Recently, optimal transport maps have been utilized to define multivariate rank statistics that allow for (asymptotically) distribution-free independence tests (Ghosal and Sen 2019; Shi et al. 2020; Shi et al. 2021).…”
Section: Mutual Information
confidence: 99%
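One of the distance-based measures alluded to above, distance correlation, can be computed from doubly-centered pairwise distance matrices. A minimal sketch for the univariate case (the standard V-statistic form; not code from the cited works):

```python
import numpy as np

def dcov2(x, y):
    # Squared sample distance covariance: the average elementwise
    # product of the two doubly-centered pairwise distance matrices.
    def centered(z):
        d = np.abs(z[:, None] - z[None, :])
        return d - d.mean(0) - d.mean(1)[:, None] + d.mean()
    return float((centered(x) * centered(y)).mean())

def dcor(x, y):
    # Distance correlation in [0, 1]; zero in the population
    # if and only if X and Y are independent.
    v = dcov2(x, y)
    denom = np.sqrt(dcov2(x, x) * dcov2(y, y))
    return float(np.sqrt(max(v, 0.0) / denom))

rng = np.random.default_rng(2)
x = rng.normal(size=500)
d_same = dcor(x, x)                     # exactly 1
d_dep = dcor(x, x**2)                   # strictly positive
d_indep = dcor(x, rng.normal(size=500)) # small, -> 0 as n grows
```

In contrast to Pearson's ρ, d_dep remains clearly above the independent baseline for the quadratic relationship, which is what makes such distance-based statistics usable as omnibus independence tests.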
“…There have been many methods employed and proposed, see e.g. [24], [43] and [29] for recent surveys. Usually these focus on the (functional) dependence of pairs of variables.…”
Section: Introduction
confidence: 99%