On Feature Decorrelation in Self-Supervised Learning

Hua, Tianyu; Wang, Wenxiao; Xue, Zihui; Ren, Sucheng; Wang, Yue; Zhao, Hang

doi:10.48550/arxiv.2105.00470

Cited by 6 publications

(11 citation statements)

References 32 publications

“…Together, these results suggest that each of the three components of LPL is crucial for learning disentangled representations in hierarchical DNNs. Two common causes for failure to disentangle representations are representational collapse and dimensional collapse, which results from excessively high correlations between neurons [49,50]. To disambiguate between these two possibilities, we computed the dimensionality of the output representations and the mean neuronal activity at every layer (Methods).…”

Section: LPL Disentangles Representations In Deep Hierarchical Network (mentioning)

confidence: 99%

The combination of Hebbian and predictive plasticity learns invariant object representations in deep sensory networks

Halvagal

Zenke

2022

Preprint

Discriminating distinct objects and concepts from sensory stimuli is essential for survival. Our brains accomplish this feat by forming meaningful internal representations in deep sensory networks with plastic synaptic connections. Experience-dependent plasticity presumably exploits temporal contingencies between sensory inputs to build these internal representations. However, the precise mechanisms underlying plasticity remain elusive. We derive a local synaptic plasticity model inspired by self-supervised machine learning techniques that shares a deep conceptual connection to Bienenstock-Cooper-Munro (BCM) theory and is consistent with experimentally observed plasticity rules. We show that our plasticity model yields disentangled object representations in deep neural networks without the need for supervision and implausible negative examples. In response to altered visual experience, our model qualitatively captures neuronal selectivity changes observed in the monkey inferotemporal cortex in-vivo. Our work suggests a plausible learning rule to drive learning in sensory networks while making concrete testable predictions.

“…In this section we provide an illustration and some discussions for degenerated (collapsed) solutions, or namely trivial solutions, in self-supervised representation learning. The discussion is inspired by the separation of complete collapse and dimensional collapse proposed in [19]. We show that our method naturally avoids complete collapse through feature-wise normalization, and could prevent/alleviate dimensional collapse through the decorrelation term Eq.…”

Section: B Discussion On Degenerated Solutions In SSL (mentioning)

confidence: 96%

“…(19) will lead to trivial solutions: all the embeddings would degenerate to a fixed point on the hypersphere. This phenomenon is called complete collapse [19]. Denote $Z_A$ and $Z_B$ as two embedding matrix of two views ($Z \in \mathbb{R}^{N \times D}$ and is row normalized), then in this case $Z_A Z_B^\top \cong \mathbf{1}$ is an all-one matrix (so as…”

Section: B Discussion On Degenerated Solutions In SSL (mentioning)

confidence: 99%

“…Another kind of collapse that has been neglected by most existing works is dimensional collapse [19]. Different from complete collapse where all the data points degenerate into a single point, dimensional collapse means data points are distributed on a line, and each dimension captures exactly the same features (or different dimensions are highly correlated can capture the same information).…”

Section: B Discussion On Degenerated Solutions In SSL (mentioning)

confidence: 99%
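
The distinction these statements draw between complete collapse (all points degenerate to one point) and dimensional collapse (points lie on a line, so every dimension carries the same information) can be illustrated numerically. The following is a minimal NumPy sketch, not taken from any of the cited papers: the helper name `effective_rank`, the tolerance `tol`, and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np

def effective_rank(Z, tol=1e-8):
    """Hypothetical helper: count covariance eigenvalues above `tol`."""
    Zc = Z - Z.mean(axis=0, keepdims=True)   # center each feature dimension
    cov = Zc.T @ Zc / Z.shape[0]             # D x D feature covariance
    eigvals = np.linalg.eigvalsh(cov)        # symmetric matrix, real eigenvalues
    return int((eigvals > tol).sum())

rng = np.random.default_rng(0)
N, D = 256, 8

# Healthy embeddings: independent features spanning all D dimensions.
healthy = rng.standard_normal((N, D))

# Complete collapse: every sample maps to the same point.
complete = np.tile(rng.standard_normal((1, D)), (N, 1))

# Dimensional collapse: samples differ, but only along a single direction.
direction = rng.standard_normal((1, D))
dimensional = rng.standard_normal((N, 1)) @ direction

ranks = {
    "healthy": effective_rank(healthy),
    "complete": effective_rank(complete),
    "dimensional": effective_rank(dimensional),
}
print(ranks)
```

Under these assumptions the healthy embeddings use all 8 dimensions, the completely collapsed ones use none, and the dimensionally collapsed ones use exactly one, even though their samples are not identical.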

“…More specifically, the new objective aims at maximizing the correlation between two augmented views of the same input and meanwhile decorrelating different (feature) dimensions of a single view's representation. We show that the objective 1) essentially pursuits discarding augmentation-variant information and preserving augmentation-invariant information, and 2) can prevent dimensional collapse [19] (i.e., different dimensions capture the same information) in nature. Furthermore, our theoretical analysis sheds more lights that under mild assumptions, our model is an instantiation of Information Bottleneck Principle [43,44,37] under SSL settings [53,9,45].…”

Section: Methods (mentioning)

confidence: 99%
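
The objective described in this statement, which maximizes the correlation between two augmented views while decorrelating the feature dimensions within each view, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the citing authors' implementation: the names `standardize` and `cca_style_loss`, the weight `lam`, and the choice to scale each feature column to unit norm are all illustrative.

```python
import numpy as np

def standardize(Z, eps=1e-12):
    """Center each feature column and scale it to unit L2 norm."""
    Zc = Z - Z.mean(axis=0, keepdims=True)
    return Zc / (np.linalg.norm(Zc, axis=0, keepdims=True) + eps)

def cca_style_loss(Z_a, Z_b, lam=1e-3):
    """Illustrative loss: invariance across views + decorrelation within views."""
    Za, Zb = standardize(Z_a), standardize(Z_b)
    eye = np.eye(Za.shape[1])
    # With unit-norm columns, ||za_j - zb_j||^2 = 2 - 2 * corr_j, so minimizing
    # this term maximizes the per-feature correlation between the two views.
    invariance = np.sum((Za - Zb) ** 2)
    # Za.T @ Za is the feature correlation matrix; push it toward the identity.
    decorrelation = np.sum((Za.T @ Za - eye) ** 2) + np.sum((Zb.T @ Zb - eye) ** 2)
    return invariance + lam * decorrelation, invariance, decorrelation

rng = np.random.default_rng(0)
N, D = 512, 4

# Two noisy views of the same underlying features.
base = rng.standard_normal((N, D))
view_a = base + 0.05 * rng.standard_normal((N, D))
view_b = base + 0.05 * rng.standard_normal((N, D))
_, inv_ok, dec_ok = cca_style_loss(view_a, view_b)

# Dimensionally collapsed embeddings: every dimension copies one feature.
feature = rng.standard_normal((N, 1))
collapsed = np.repeat(feature, D, axis=1)
_, inv_bad, dec_bad = cca_style_loss(collapsed, collapsed)

print(f"healthy:   invariance={inv_ok:.4f}, decorrelation={dec_ok:.4f}")
print(f"collapsed: invariance={inv_bad:.4f}, decorrelation={dec_bad:.4f}")
```

In the collapsed case the invariance term alone is zero, so it cannot detect the problem; only the decorrelation term, which reaches 2(D² − D) when every pair of dimensions is perfectly correlated, penalizes it.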

From Canonical Correlation Analysis to Self-supervised Graph Neural Networks

Zhang

Wu

Yan

et al. 2021

Preprint

We introduce a conceptually simple yet effective model for self-supervised representation learning with graph data. It follows the previous methods that generate two views of an input graph through data augmentation. However, unlike contrastive methods that focus on instance-level discrimination, we optimize an innovative feature-level objective inspired by classical Canonical Correlation Analysis. Compared with other works, our approach requires none of the parameterized mutual information estimator, additional projector, asymmetric structures, and most importantly, negative samples which can be costly. We show that the new objective essentially 1) aims at discarding augmentation-variant information by learning invariant representations, and 2) can prevent degenerated solutions by decorrelating features in different dimensions. Our theoretical analysis further provides an understanding for the new objective which can be equivalently seen as an instantiation of the Information Bottleneck Principle under the self-supervised setting. Despite its simplicity, our method performs competitively on seven public graph datasets. The code is available at: https://github.com/hengruizhang98/CCA-SSG.

Efficient Training of Visual Transformers with Small Datasets

Liu

Sangineto

Bi

et al. 2021

Preprint

Visual Transformers (VTs) are emerging as an architectural paradigm alternative to Convolutional networks (CNNs). Differently from CNNs, VTs can capture global relations between image elements and they potentially have a larger representation capacity. However, the lack of the typical convolutional inductive bias makes these models more data-hungry than common CNNs. In fact, some local properties of the visual domain which are embedded in the CNN architectural design, in VTs should be learned from samples. In this paper, we empirically analyse different VTs, comparing their robustness in a small training-set regime, and we show that, despite having a comparable accuracy when trained on ImageNet, their performance on smaller datasets can be largely different. Moreover, we propose a self-supervised task which can extract additional information from images with only a negligible computational overhead. This task encourages the VTs to learn spatial relations within an image and makes the VT training much more robust when training data are scarce. Our task is used jointly with the standard (supervised) training and it does not depend on specific architectural choices, thus it can be easily plugged in the existing VTs. Using an extensive evaluation with different VTs and datasets, we show that our method can improve (sometimes dramatically) the final accuracy of the VTs. The code will be available upon acceptance.
