2022
DOI: 10.48550/arxiv.2202.05780
Preprint

A Modern Self-Referential Weight Matrix That Learns to Modify Itself

Abstract: The weight matrix (WM) of a neural network (NN) is its program. The programs of many traditional NNs are learned through gradient descent in some error function, then remain fixed. The WM of a self-referential NN, however, can keep rapidly modifying all of itself during runtime. In principle, such NNs can meta-learn to learn, and meta-meta-learn to meta-learn to learn, and so on, in the sense of recursive self-improvement. While NN architectures potentially capable of implementing such behavior have been proposed…
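The abstract describes a single weight matrix that, when queried with its input, also produces the ingredients of its own update. The following NumPy sketch illustrates that self-modification pattern in its simplest delta-rule form; the slot layout (output, key, query, learning-rate logit), the softmax key normalization, and the sigmoid gating are simplifying assumptions modeled on fast weight programmers, not the paper's exact formulation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def srwm_step(W, x):
    """One step of a self-referential weight matrix (simplified sketch).

    W has shape (d_out + 2*d_in + 1, d_in): a single matrix whose output
    is split into the layer output y, a key k, a query q, and a scalar
    learning-rate logit -- i.e. the matrix generates its own update.
    """
    d_in = x.shape[0]
    o = W @ x
    d_out = o.shape[0] - 2 * d_in - 1
    y = o[:d_out]                                  # ordinary layer output
    k = softmax(o[d_out:d_out + d_in])             # normalized key
    q = softmax(o[d_out + d_in:d_out + 2 * d_in])  # normalized query
    lr = 1.0 / (1.0 + np.exp(-o[-1]))              # sigmoid-gated rate

    v_target = W @ q           # value the matrix wants stored at k
    v_current = W @ k          # value currently stored at k
    W = W + lr * np.outer(v_target - v_current, k)  # delta update rule
    return y, W

# Demo: the same matrix is both the program and the thing being rewritten.
rng = np.random.default_rng(0)
d_in, d_out = 8, 4
W = rng.normal(scale=0.1, size=(d_out + 2 * d_in + 1, d_in))
for _ in range(5):
    y, W = srwm_step(W, rng.normal(size=d_in))
```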

Cited by 4 publications (4 citation statements)
References 21 publications
“…The meta-representation refers to the representation of meta-knowledge ω. This knowledge could be anything from initial model parameters (Finn et al., 2017; Rothfuss et al., 2018; Fakoor et al., 2019; Liu et al., 2019), the inner optimization process (Andrychowicz et al., 2016; Bello et al., 2017; Metz et al., 2018; Irie et al., 2022), or the model architecture (Zoph and Le, 2016; Liu et al., 2018; Lian et al., 2019; Real et al., 2019). The meta-optimizer refers to the choice of optimization for the outer level in the meta-training phase, which updates the meta-knowledge ω.…”
Section: Learning In Network With Plastic Synapses, Learning How To Learn
Citation type: mentioning
Confidence: 99%
“…The meta-representation refers to the representation of meta-knowledge ω. This knowledge could be anything from initial model parameters (25-28), the inner optimization process (29-32), or the model architecture (33-36). The meta-optimizer refers to the choice of optimization for the outer level in the meta-training phase, which updates the meta-knowledge ω.…”
Section: Learning In Network With Plastic Synapses
Citation type: mentioning
Confidence: 99%
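Both statements above split meta-learning into a meta-representation ω and a meta-optimizer that updates ω at the outer level. Here is a toy, self-contained illustration of that split (a hypothetical example, not code from any of the cited works): ω is the initialization of the task parameters, the inner loop adapts it per task, and the outer loop updates ω via a finite-difference meta-gradient.

```python
def inner_adapt(omega, target, inner_lr=0.1, steps=5):
    """Inner optimization: adapt from the shared init omega on one task.
    Tasks are 1-D quadratics L(w) = (w - target)**2, so the gradient
    2 * (w - target) is closed-form and no autograd is needed."""
    w = omega
    for _ in range(steps):
        w -= inner_lr * 2 * (w - target)
    return w

def meta_train(targets, meta_lr=0.05, epochs=100, eps=1e-4):
    """Outer loop (the 'meta-optimizer'): update the meta-knowledge omega
    by a finite-difference gradient of the post-adaptation loss,
    averaged over tasks."""
    omega = 0.0
    for _ in range(epochs):
        grad = 0.0
        for t in targets:
            lo = (inner_adapt(omega - eps, t) - t) ** 2
            hi = (inner_adapt(omega + eps, t) - t) ** 2
            grad += (hi - lo) / (2 * eps)
        omega -= meta_lr * grad / len(targets)
    return omega

print(meta_train([1.0, 2.0, 3.0]))  # an init from which every task adapts quickly
```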
“…We visualize their attention maps in Section VI-B. The observed generalization gap makes environments with large differences in object distributions (requiring zero-shot adaptation) fruitful for developing and evaluating novel fast-adaptation [44]-[48] and meta-learning [49] agents, such as the ones based on Fast Weight Programmers [50]-[54].…”
Section: OOD Generalization Experiments
Citation type: mentioning
Confidence: 99%
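The Fast Weight Programmers cited in the last statement are the lineage the self-referential WM builds on: instead of one matrix modifying itself, a separate slow network, with parameters trained by gradient descent, programs a fast weight matrix at every step. A minimal sketch under the same simplifying assumptions as before (delta update rule, normalized keys, sigmoid-gated learning rate); all names and shapes here are illustrative, not taken from the cited papers.

```python
import numpy as np

def fwp_step(slow, W_fast, x):
    """One step of a fast weight programmer (simplified sketch).
    The slow parameters emit a key, value, query, and learning rate;
    the fast matrix W_fast is rewritten by the delta rule and then
    queried for the output."""
    Wk, Wv, Wq, wb = slow
    k = Wk @ x
    k = k / (np.linalg.norm(k) + 1e-6)                  # normalized key
    v = Wv @ x                                          # value to store
    q = Wq @ x                                          # retrieval query
    lr = 1.0 / (1.0 + np.exp(-(wb @ x)))                # sigmoid-gated rate
    W_fast = W_fast + lr * np.outer(v - W_fast @ k, k)  # delta rule
    return W_fast @ q, W_fast

# Demo: the slow weights stay fixed within a sequence; only W_fast changes.
rng = np.random.default_rng(0)
d = 8
slow = (rng.normal(size=(d, d)), rng.normal(size=(d, d)),
        rng.normal(size=(d, d)), rng.normal(size=d))
W_fast = np.zeros((d, d))
for _ in range(5):
    y, W_fast = fwp_step(slow, W_fast, rng.normal(size=d))
```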