2023
DOI: 10.1109/tpami.2022.3157042

A Review of the Gumbel-max Trick and its Extensions for Discrete Stochasticity in Machine Learning


Cited by 32 publications (22 citation statements). References: 99 publications.
“…where u_k ∼ Uniform(0, 1), β is a noise-scaling parameter [16], τ is the logit temperature, and ε is a small numeric constant to avoid numerical issues. In our case, we fixed β = 2 and τ = 2 for all experiments.…”
Section: Methods
confidence: 99%
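
The excerpt elides the sampling formula itself. Below is a minimal PyTorch sketch of one plausible reading, assuming the standard Gumbel-softmax form in which β scales the Gumbel noise, τ tempers the logits, and ε guards the logarithms; the exact placement of β and ε, and the helper name gumbel_softmax_sample, are our assumptions.

import torch

def gumbel_softmax_sample(logits, beta=2.0, tau=2.0, eps=1e-10):
    # u_k ~ Uniform(0, 1), one draw per logit
    u = torch.rand_like(logits)
    # transform to Gumbel(0, 1) noise; eps keeps both logs away from log(0)
    g = -torch.log(-torch.log(u + eps) + eps)
    # scale the noise by beta, temper by tau, relax to a soft one-hot sample
    return torch.softmax((logits + beta * g) / tau, dim=-1)

With β = 2 and τ = 2, as fixed in the excerpt, samples are both noisier and softer than under the common default β = τ = 1.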
“…Optimization through discrete structures. Optimization of neural networks with discrete structures has long been performed using either Gumbel relaxations or the straight-through Gumbel-Softmax trick [17], [16]. In terms of estimating the discrete structure of a graph, Franceschi et al. [14] proposed a meta-learning method to jointly learn the graph structure and the underlying parameters of the classification model.…”
Section: Introduction
confidence: 99%
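
As a concrete illustration of the straight-through variant mentioned in this excerpt, here is a minimal PyTorch sketch; the helper name st_gumbel_softmax is ours, and torch.nn.functional.gumbel_softmax(..., hard=True) provides the same behavior directly.

import torch
import torch.nn.functional as F

def st_gumbel_softmax(logits, tau=1.0):
    # relaxed, differentiable sample from the Gumbel-softmax distribution
    y_soft = F.gumbel_softmax(logits, tau=tau, hard=False)
    # discretize to a one-hot vector for the forward pass
    index = y_soft.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)
    # straight-through estimator: the forward value is y_hard,
    # while gradients flow through y_soft
    return y_hard - y_soft.detach() + y_soft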
“…One challenge in fitting this model is that the likelihood function is not smooth because of the minimum structure. We could use the so-called Gumbel-max trick with a tuning parameter a (Huijben et al., 2021)…”
Section: The Min-linear LR
confidence: 99%
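
The excerpt does not spell out the smoothed objective. One common way to realize such smoothing, sketched below under that assumption, is the Gumbel/log-sum-exp identity min_j x_j = lim_{a→0} −a log Σ_j exp(−x_j / a), where the tuning parameter a trades smoothness against approximation error; the helper name soft_min is ours.

import torch

def soft_min(x, a=0.1, dim=-1):
    # smooth, differentiable surrogate for min(x) along `dim`;
    # recovers the exact minimum in the limit a -> 0
    return -a * torch.logsumexp(-x / a, dim=dim)

# soft_min(torch.tensor([1.0, 2.0, 3.0]), a=0.01) returns ~1.0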
“…orderings of S_K. While exploiting the Gumbel-max trick can bring the computation down to O(2^K) [24, 32], exact computation remains intractable for any practical application. Luckily, numerical methods to efficiently approximate p_θ(S_K) exist [32], and, as we show in Sec.…”
Section: Neighborhood Sampling and Sample Likelihood
confidence: 99%
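
For context, the following minimal sketch shows the Gumbel-top-k mechanism that underlies sampling an unordered subset S_K without replacement; it illustrates only the sampling side, not the O(2^K) likelihood computation the excerpt refers to, and the helper name gumbel_top_k is ours.

import torch

def gumbel_top_k(logits, k):
    # perturb each logit with independent Gumbel(0, 1) noise
    g = -torch.log(-torch.log(torch.rand_like(logits)))
    # the k largest perturbed logits form an ordered sample without
    # replacement (Plackett-Luce); discarding the order yields S_K
    return torch.topk(logits + g, k).indices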