2023
DOI: 10.1109/tpami.2022.3157042

A Review of the Gumbel-max Trick and its Extensions for Discrete Stochasticity in Machine Learning


Cited by 32 publications (22 citation statements). References: 99 publications.
“…where u_k ∼ Uniform(0, 1), β is a noise-scaling parameter [16], τ is the logit temperature, and ε is a small numeric constant to avoid numerical issues. In our case, we fixed β = 2 and τ = 2 for all experiments.…”
Section: Methods
confidence: 99%
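
The excerpt elides the sampling formula itself. Below is a minimal PyTorch sketch of one plausible reading, assuming the standard Gumbel-softmax form in which β scales the Gumbel noise, τ tempers the logits, and ε guards the logarithms; the exact placement of β and ε, and the helper name gumbel_softmax_sample, are our assumptions.

import torch

def gumbel_softmax_sample(logits, beta=2.0, tau=2.0, eps=1e-10):
    # u_k ~ Uniform(0, 1), one draw per logit
    u = torch.rand_like(logits)
    # transform to Gumbel(0, 1) noise; eps keeps both logs away from log(0)
    g = -torch.log(-torch.log(u + eps) + eps)
    # scale the noise by beta, temper by tau, relax to a soft one-hot sample
    return torch.softmax((logits + beta * g) / tau, dim=-1)

With β = 2 and τ = 2, as fixed in the excerpt, samples are both noisier and softer than under the common default β = τ = 1.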
“…Optimization through discrete structures. Optimization of neural networks with discrete structures has long been performed using either Gumbel relaxations or the straight-through Gumbel-Softmax trick [17], [16]. In terms of estimating the discrete structure of a graph, Franceschi et al. [14] proposed a meta-learning method to jointly learn the graph structure and the underlying parameters of the classification model.…”
Section: Introduction
confidence: 99%
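
As a concrete illustration of the straight-through variant mentioned in this excerpt, here is a minimal PyTorch sketch; the helper name st_gumbel_softmax is ours, and torch.nn.functional.gumbel_softmax(..., hard=True) provides the same behavior directly.

import torch
import torch.nn.functional as F

def st_gumbel_softmax(logits, tau=1.0):
    # relaxed, differentiable sample from the Gumbel-softmax distribution
    y_soft = F.gumbel_softmax(logits, tau=tau, hard=False)
    # discretize to a one-hot vector for the forward pass
    index = y_soft.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)
    # straight-through estimator: the forward value is y_hard,
    # while gradients flow through y_soft
    return y_hard - y_soft.detach() + y_soft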
“…One challenge in fitting this model is that the likelihood function is not smooth because of the minimum structure. We could use the so-called Gumbel-max trick with a tuning parameter a (Huijben et al., 2021)…”
Section: The Min-linear LR
confidence: 99%
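
The excerpt does not spell out the smoothed objective. One common way to realize such smoothing, sketched below under that assumption, is the Gumbel/log-sum-exp identity min_j x_j = lim_{a→0} −a log Σ_j exp(−x_j / a), where the tuning parameter a trades smoothness against approximation error; the helper name soft_min is ours.

import torch

def soft_min(x, a=0.1, dim=-1):
    # smooth, differentiable surrogate for min(x) along `dim`;
    # recovers the exact minimum in the limit a -> 0
    return -a * torch.logsumexp(-x / a, dim=dim)

# soft_min(torch.tensor([1.0, 2.0, 3.0]), a=0.01) returns ~1.0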
“…orderings of S_K. While exploiting the Gumbel-max trick can bring the computation down to O(2^K) [24, 32], exact computation remains intractable for any practical application. Luckily, numerical methods to efficiently approximate p_θ(S_K) exist [32], and, as we show in Sec.…”
Section: Neighborhood Sampling and Sample Likelihood
confidence: 99%
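
For context, the following minimal sketch shows the Gumbel-top-k mechanism that underlies sampling an unordered subset S_K without replacement; it illustrates only the sampling side, not the O(2^K) likelihood computation the excerpt refers to, and the helper name gumbel_top_k is ours.

import torch

def gumbel_top_k(logits, k):
    # perturb each logit with independent Gumbel(0, 1) noise
    g = -torch.log(-torch.log(torch.rand_like(logits)))
    # the k largest perturbed logits form an ordered sample without
    # replacement (Plackett-Luce); discarding the order yields S_K
    return torch.topk(logits + g, k).indices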