2022
DOI: 10.1007/978-3-030-99336-8_1

Categorical Foundations of Gradient-Based Learning

Abstract: We propose a categorical semantics of gradient-based machine learning algorithms in terms of lenses, parametric maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions such as MSE and Softmax cross-entropy, shedding new light on their similarities and differences. Our approach to gradient-based learning has examples…
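
To make the abstract's lens-based picture concrete, here is a minimal Python sketch (an illustration of the general idea only, not the authors' implementation; names such as ParaLens, linear, mse_grad and sgd_step are invented for this example). A parametric map is modelled as a forward pass together with a reverse pass that sends an output gradient back to gradients on the parameters and the input; chaining the model's reverse pass with the reverse pass of an MSE loss and a plain gradient-descent update gives one learning step.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class ParaLens:
        forward: Callable  # (params, x) -> y
        reverse: Callable  # (params, x, dy) -> (dparams, dx)

    def linear(w0):
        """A one-parameter linear model y = w * x, packaged as a parametric lens."""
        fwd = lambda w, x: w * x
        rev = lambda w, x, dy: (dy * x, dy * w)  # (dL/dw, dL/dx)
        return ParaLens(fwd, rev), w0

    def mse_grad(y_pred, y_true):
        """Reverse pass of the MSE loss L = (y_pred - y_true)**2: returns dL/dy_pred."""
        return 2.0 * (y_pred - y_true)

    def sgd_step(model, w, sample, lr=0.1):
        """One gradient-descent step: run forward, pull the loss gradient back, update w."""
        x, y_true = sample
        y_pred = model.forward(w, x)
        dw, _dx = model.reverse(w, x, mse_grad(y_pred, y_true))
        return w - lr * dw

    model, w = linear(0.0)
    for _ in range(50):
        w = sgd_step(model, w, (2.0, 6.0))  # fit y = 3 * x from a single sample
    print(round(w, 3))  # converges to ~3.0

In the paper's framework, optimizers such as ADAM or Nesterov momentum would take the place of this plain update; the sketch fixes vanilla gradient descent only for brevity.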

Cited by 29 publications (26 citation statements). References 47 publications.

Citation statements:
“…Then in particular, by Theorem 4.17, the coKleisli category associated to these models is a CRDC. By the results of [11], this means that one could apply supervised learning techniques to these examples. This possibility of combining quantum computation with supervised learning is an exciting direction we hope will be pursued in the future.…”
Section: Discussion
confidence: 99%
“…Specifically, a CRDC is equivalent to giving a CDC in which the subcategory of linear maps in each simple slice has a transpose operator, which categorically speaking is a special type of dagger structure. The explicit connection with supervised learning was then made in [11], which showed how to describe several supervised-learning techniques in the abstract setting of a CRDC.…”
Section: Introduction
confidence: 99%
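
As background for the transpose operator mentioned in the statement above (standard material, not a claim of the quoted paper): in the canonical smooth example, the reverse derivative is the forward derivative with its linear part transposed, i.e. multiplication by the transposed Jacobian.

    % Canonical smooth instance: f : R^n -> R^m smooth.
    % Forward (Cartesian differential) derivative vs. reverse derivative:
    \[
      D[f](x, v) \;=\; J_f(x)\, v ,
      \qquad
      R[f](x, w) \;=\; J_f(x)^{\mathsf{T}}\, w ,
    \]
    % so passing from D[f] to R[f] transposes the linear map D[f](x, -),
    % which is the dagger-like structure referred to above.
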
“…In the extended version, we model categorically situations where there is a notion of distance between resources, and instead of exact resource conversions one either studies approximate transformations or sequences of transformations that succeed in the limit. In the extended version, we discuss a variant of a construction on monoidal categories, used in special cases in [31] and discussed in more detail in [23,33], that allows one to declare some resources free and thus enlarge the set of possible resource conversions.…”
Section: Introduction
confidence: 99%
“…Reverse Derivative Categories [10] have recently been introduced as a formalism for studying the concept of differentiable functions abstractly. As explored in [11], it turns out that this framework is suitable for giving a categorical semantics to gradient-based learning. In this approach, models (as for instance neural networks) correspond to morphisms in some RDC.…”
Section: Introduction
confidence: 99%
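
The composition at the heart of that semantics can be written down directly. The Python sketch below is an illustration only (compose, square and double are invented names, not from [10] or [11]): a morphism carries a forward pass and a reverse derivative, and composites chain forward passes in order while running reverse passes in the opposite order, which is exactly the reverse-mode chain rule.

    # Illustration only: morphisms as (forward, reverse-derivative) pairs.
    def compose(f, g):
        """Compose two such pairs: forward passes chain, reverse passes run backwards."""
        f_fwd, f_rev = f
        g_fwd, g_rev = g

        def fwd(x):
            return g_fwd(f_fwd(x))

        def rev(x, dz):
            y = f_fwd(x)
            dy = g_rev(y, dz)    # pull the gradient back through g ...
            return f_rev(x, dy)  # ... and then through f

        return fwd, rev

    square = (lambda x: x * x, lambda x, dy: 2 * x * dy)
    double = (lambda x: 2 * x, lambda x, dy: 2 * dy)

    fwd, rev = compose(square, double)  # x |-> 2 * x**2
    print(fwd(3.0), rev(3.0, 1.0))      # 18.0 and d/dx(2 x^2) at x = 3, i.e. 12.0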