2021
DOI: 10.48550/arxiv.2112.10510
Preprint

Transformers Can Do Bayesian Inference

Cited by 7 publications (5 citation statements)
References 0 publications
“… 26 Unlike meta-learning, which is centered around enhancing the learning process, PFNs are pretrained on synthetic data to approximate Bayesian inference on new data. 43 Bayesian inference is a statistical method that allows for the quantification and management of uncertainty, a crucial feature in clinical prognosis scenarios. 44 The pretraining on synthetic data enables TabPFN to adeptly navigate complex patterns within real-world data, showcasing a nuanced approach to tabular data handling.…”
Section: Discussion
confidence: 99%
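The statement above summarizes the PFN recipe: sample many synthetic datasets from a prior, train a transformer to predict held-out labels from in-context (x, y) examples, and at inference approximate the posterior predictive in a single forward pass. The following is a minimal sketch of that pretraining loop under stated assumptions; the names `PFN` and `sample_prior_dataset`, the toy linear-function prior, and the MSE point-prediction head are all illustrative (the actual paper predicts a discretized distribution over the target rather than a point estimate).

```python
import torch
import torch.nn as nn

# Minimal PFN-style pretraining sketch (illustrative, not the paper's code).
# Each training example is a dataset drawn from a prior; the model learns
# to predict a query point's label from the in-context (x, y) pairs.

class PFN(nn.Module):
    def __init__(self, d_model=64):
        super().__init__()
        self.embed = nn.Linear(2, d_model)  # embed (x, y) context pairs
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)   # point prediction (assumption;
                                            # the paper uses a discretized
                                            # output distribution)

    def forward(self, ctx, x_query):
        # ctx: (B, n_ctx, 2); x_query: (B, 1); query y-slot is zero-padded
        q = torch.cat([x_query, torch.zeros_like(x_query)], -1).unsqueeze(1)
        h = self.encoder(self.embed(torch.cat([ctx, q], dim=1)))
        return self.head(h[:, -1])          # prediction at the query token

def sample_prior_dataset(batch, n_ctx):
    # Toy prior over tasks: y = w*x + b + noise, with w, b ~ N(0, 1)
    w, b = torch.randn(batch, 1), torch.randn(batch, 1)
    x = torch.randn(batch, n_ctx + 1)
    y = w * x + b + 0.1 * torch.randn_like(x)
    ctx = torch.stack([x[:, :-1], y[:, :-1]], dim=-1)
    return ctx, x[:, -1:], y[:, -1:]        # context, query x, query y

model = PFN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):                     # toy pretraining loop
    ctx, xq, yq = sample_prior_dataset(batch=32, n_ctx=10)
    loss = nn.functional.mse_loss(model(ctx, xq), yq)
    opt.zero_grad(); loss.backward(); opt.step()
```

Because every training sample is a fresh dataset from the prior, minimizing prediction error on held-out points drives the network toward the posterior predictive distribution implied by that prior, which is the sense in which a PFN "approximates Bayesian inference."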
“…There is a long line of work investigating the capabilities [Vaswani et al., 2017, Dehghani et al., 2018, Yun et al., 2019, Pérez et al., 2019, Yao et al., 2021, Bhattamishra et al., 2020b, Zhang et al., 2022], limitations [Hahn, 2020, Bhattamishra et al., 2020a], applications [Lu et al., 2021a, Dosovitskiy et al., 2020, Parmar et al., 2018], and internal workings [Elhage et al., 2021, Snell et al., 2021, Weiss et al., 2021, Edelman et al., 2022, Olsson et al., 2022] of Transformer models. Most similar to our work, Müller et al. [2021] introduce a "Prior-data fitted transformer network" that is trained to approximate Bayesian inference and generate predictions for downstream learning problems. However, while their focus is on performing Bayesian inference faster than existing methods (e.g., MCMC) and using their network for downstream tasks (with or without parameter fine-tuning), we focus on formalizing and understanding in-context learning through simple function classes.…”
Section: Related Work
confidence: 99%
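The "simple function classes" framing in this statement can be made concrete: for linear functions, the reference in-context predictor on a prompt (x_1, f(x_1), ..., x_k, f(x_k), x_query) is ordinary least squares fit to the context points, and a trained transformer's in-context prediction is evaluated against that baseline. A brief sketch of the baseline, with illustrative names and dimensions:

```python
import numpy as np

# Least-squares baseline for in-context prediction on a linear function
# class: fit w to the context pairs, then predict at the query point.
rng = np.random.default_rng(0)
d, k = 5, 20                                # input dim, context length
w_true = rng.standard_normal(d)             # the sampled function f
X_ctx = rng.standard_normal((k, d))
y_ctx = X_ctx @ w_true
x_query = rng.standard_normal(d)

w_hat, *_ = np.linalg.lstsq(X_ctx, y_ctx, rcond=None)
baseline_pred = x_query @ w_hat             # reference prediction
print(baseline_pred, x_query @ w_true)      # identical once k >= d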
“…Decision-Pretrained Transformer. DPT is an alternative approach inspired by the Bayesian inference approximation (Müller et al., 2021). Unlike AD, it trains a transformer to predict the optimal action for a query state given a random, task-specific context.…”
Section: In-context Reinforcement Learning
confidence: 99%
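A rough sketch of the DPT pretraining objective described in this statement: the transformer receives a context of transitions from one sampled task plus a query state, and is trained with cross-entropy against the optimal action for that state. The class name, dimensions, and the random stand-in for the task's optimal-action oracle below are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical DPT-style pretraining step (illustrative names/dims).
N_ACTIONS, D_STATE, D_MODEL = 4, 8, 64

class DPT(nn.Module):
    def __init__(self):
        super().__init__()
        # Each context token packs (state, one-hot action, reward).
        self.embed = nn.Linear(D_STATE + N_ACTIONS + 1, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D_MODEL, N_ACTIONS)   # logits over actions

    def forward(self, context, s_query):
        # context: (B, T, D_STATE + N_ACTIONS + 1); s_query: (B, D_STATE)
        pad = torch.zeros(s_query.shape[0], N_ACTIONS + 1)
        q = torch.cat([s_query, pad], dim=-1).unsqueeze(1)
        h = self.encoder(self.embed(torch.cat([context, q], dim=1)))
        return self.head(h[:, -1])                  # logits at the query slot

# One toy training step; random tensors stand in for a sampled task's
# transitions and for the optimal action a task oracle would supply.
model = DPT()
B, T = 16, 12
context = torch.randn(B, T, D_STATE + N_ACTIONS + 1)
s_query = torch.randn(B, D_STATE)
optimal_action = torch.randint(0, N_ACTIONS, (B,))
loss = nn.functional.cross_entropy(model(context, s_query), optimal_action)
loss.backward()
```

The contrast with AD noted in the statement is visible in the target: DPT supervises directly on the optimal action for the query state, rather than on the next action of a logged learning history.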