2021
DOI: 10.1016/j.knosys.2021.107567

Learning with Hilbert–Schmidt independence criterion: A review and new perspectives

Cited by 19 publications (11 citation statements)
References 137 publications
“…2.4 and 2.5 to inner products from reproducing kernel Hilbert spaces. See [46,47] for more details. HSIC is formulated as follows:…”
Section: Similarity Between Representations; citation type: mentioning
confidence: 99%
“…The original purpose of HSIC was to determine the statistical independence of two sets of variables but has since been used for various machine learning problems such as feature selection, clustering, dimensionality reduction, and kernel optimization [47].…”
Section: Similarity Between Representations; citation type: mentioning
confidence: 99%
“…where K and L are kernel matrices derived from a set of input data, and HSIC is the Hilbert-Schmidt independence criterion (HSIC) that is used to compute a statistical dependence between two matrix kernels [45]. Apparently, CKA is a normalized version of HSIC that is invariant to uniform scaling.…”
Section: Experience Sharing; citation type: mentioning
confidence: 99%
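
The relationship described in the statement above, CKA as a normalized HSIC that is invariant to uniform scaling, can be illustrated with a minimal sketch. It assumes the standard biased empirical HSIC estimator, tr(KHLH)/(n-1)^2 with centering matrix H, and linear kernels on random data; the function names `center`, `hsic`, and `cka` and the toy inputs are illustrative choices, not taken from the citing paper or the reviewed article.

```python
import numpy as np


def center(K):
    """Double-center a kernel matrix: H K H with H = I - (1/n) 11^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def hsic(K, L):
    """Biased empirical HSIC estimate from two n x n kernel matrices."""
    n = K.shape[0]
    return np.trace(center(K) @ center(L)) / (n - 1) ** 2


def cka(K, L):
    """Centered kernel alignment: HSIC normalized by the self-HSIC terms."""
    return hsic(K, L) / np.sqrt(hsic(K, K) * hsic(L, L))


# Toy data (assumption): two random feature sets over the same 50 samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
Y = rng.normal(size=(50, 5))

# Linear kernels (assumption); any positive semi-definite kernel could be used.
K = X @ X.T
L = Y @ Y.T

print("HSIC(K, L):      ", hsic(K, L))
print("HSIC(3K, L):     ", hsic(3.0 * K, L))        # scales with the kernel
print("CKA(K, L):       ", cka(K, L))
print("CKA(3K, 0.5L):   ", cka(3.0 * K, 0.5 * L))   # unchanged by scaling
```

In this sketch, rescaling a kernel matrix rescales HSIC by the same factor, while the normalization in `cka` cancels the factor, which is the invariance the statement refers to.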
“…To approach the first question, we leverage the interplay of Hilbert-Schmidt independence criterion (HSIC) and orthogonal projection, hence the name HSIC-Bottleneck Orthogonalization (HBO). Taking a close look at both: HSIC is a non-parametric kernel-based technique utilized to assess the statistical (in)dependence of different layers, which has been widely adopted for various learning tasks (Wang, Dai, and Liu 2021) but is under-investigated in CL community (Wang et al 2023); And a basic idea behind the orthogonal projection is to regularize gradient update directions that do not disturb the weights of previous tasks (Zeng et al 2019). Based on them, the introduced HBO implements non-overwritten parameter updates facilitated by the HSIC-bottleneck training in an orthogonal space, where one can exploit readily available gradient updates by measuring nonlinear dependencies between the inputs and outputs.…”
Section: Introduction; citation type: mentioning
confidence: 99%