TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second

Hollmann, Noah; Müller, S.; Eggensperger, Katharina; Hutter, Frank

doi:10.48550/arxiv.2207.01848

Cited by 21 publications

(19 citation statements)

References 0 publications


“…For interactome signature discovery, GO terms were used to retain functionally coherent proteins per fragments, followed by NMF for fragment-protein matrix decomposition. The ML methodology included pretraining of the FFF descriptor with a blend of topological and physicochemical descriptors, (41,69,70) binary classification with TabPFN models (40,71), and interpretability of promiscuity predictions using Shapley value analysis (42,72,73). A fully automated ML modeler is provided as part of the ligand discovery web resource.…”

Section: Methods Summary (mentioning)

confidence: 99%

“…In brief, we first labeled screened fragments as promiscuous (1) or nonpromiscuous (0), according to thresholds in protein-interaction counts. Then, we used a transformer-based ML model (TabPFN) to map a compound's FFF descriptor to a classification score (0 or 1) (40). TabPFN is a fully learned model that approximates Bayesian inference and requires no hyperparameter tuning, making it straightforward to obtain performant ML classifiers based on our chemoproteomics profiling data.…”

Section: Fragment Promiscuity Prediction (mentioning)

confidence: 99%


Large-scale chemoproteomics expedites ligand discovery and predicts ligand behavior in cells

Offensperger, Tin, Duran-Frigola et al. 2024. Science.

Chemical modulation of proteins enables a mechanistic understanding of biology and represents the foundation of most therapeutics. However, despite decades of research, 80% of the human proteome lacks functional ligands. Chemical proteomics has advanced fragment-based ligand discovery toward cellular systems, but throughput limitations have stymied the scalable identification of fragment-protein interactions. We report proteome-wide maps of protein-binding propensity for 407 structurally diverse small-molecule fragments. We verified that identified interactions can be advanced to active chemical probes of E3 ubiquitin ligases, transporters, and kinases. Integrating machine learning binary classifiers further enabled interpretable predictions of fragment behavior in cells. The resulting resource of fragment-protein interactions and predictive models will help to elucidate principles of molecular recognition and expedite ligand discovery efforts for hitherto undrugged proteins.


“…Our work is inspired by [17] which studies ICL in synthetic settings and demonstrates transformers can serve as complex classifiers through ICL. In parallel, [19] uses ICL as an AutoML (i.e. model-selection, hyperparameter tuning) framework where they plug in a dataset to transformer and use it as a classifier for new test points.…”

Section: Related Work (mentioning)

confidence: 99%

“…In this section, we will discuss how ICL can be interpreted as an implicit model selection procedure building on the formalism that transformer is a learning algorithm. Following Figure 2 and prior works [17,24,19], a plausible assumption is that, transformer can implement ERM algorithms up to a certain accuracy. Then, model selection can be formalized by the selection of the right hypothesis class so that running ERM on that hypothesis class can strike a good bias-variance tradeoff during ICL.…”

Section: Interpreting In-context Learning As a Model Selection Proced... (mentioning)

confidence: 99%
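
The model-selection reading in the statement above can be illustrated with a small sketch: run ERM separately on each candidate hypothesis class, then keep the class whose fitted predictor does best on held-out points. Everything here is an illustrative assumption rather than the cited paper's procedure: the two hypothesis classes (constant and 1-D linear predictors), the squared-error loss, the held-out split, and the helper names `fit_constant`, `fit_linear`, and `select_hypothesis_class` are all hypothetical.

```python
import random

def fit_constant(xs, ys):
    """ERM over constant predictors: the mean minimizes squared error."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """ERM over 1-D affine predictors via closed-form least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mx) ** 2 for x in xs)
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = cov_xy / var_x if var_x > 0 else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def mse(predict, xs, ys):
    """Mean squared error of a predictor on a sample."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def select_hypothesis_class(xs, ys, n_val):
    """Fit each class on the training split; pick the lowest held-out error."""
    train_x, train_y = xs[:-n_val], ys[:-n_val]
    val_x, val_y = xs[-n_val:], ys[-n_val:]
    candidates = {"constant": fit_constant, "linear": fit_linear}
    scores = {
        name: mse(fit(train_x, train_y), val_x, val_y)
        for name, fit in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic task (assumed for illustration): y = 3x + small Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(40)]
ys = [3.0 * x + rng.gauss(0, 0.1) for x in xs]
best, scores = select_hypothesis_class(xs, ys, n_val=10)
print(best, scores)
```

On this synthetic task the richer class wins because the target is genuinely linear; with very few or very noisy points the constant class can win instead, which is the bias-variance tradeoff the statement refers to.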

“…Recent works [17,24] demonstrate that ICL can also be used to infer general functional relationships. For instance, [19,17] aims to solve certain supervised learning problems where they feed an entire training dataset $(\mathbf{x}_i, f(\mathbf{x}_i))_{i=1}^{n-1}$ as the input prompt, expecting that conditioning $\mathbf{x}_t = C\mathbf{s}_t$ and $\mathbf{s}_{t+1} \sim \mathcal{N}(A\mathbf{s}_t, \sigma^2 I)$ with randomly sampled $C$, $A$. Each setting trains a transformer with large number of random regression tasks and evaluates on a new task from the same distribution.…”

mentioning

confidence: 99%
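
The dynamical-system setting named in the statement above (observations x_t = C s_t, states s_{t+1} ~ N(A s_t, sigma^2 I)) can be simulated in a few lines. This is a minimal sketch under stated assumptions: the state dimension, the Gaussian distributions used to draw A and C, the initial state, and the helper names `matvec` and `simulate` are hypothetical choices made here, not taken from the cited paper.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def simulate(A, C, s0, sigma, steps, rng):
    """Return observations x_t = C s_t while the hidden state evolves as
    s_{t+1} = A s_t + noise, with independent N(0, sigma^2) noise per coordinate."""
    s = list(s0)
    observations = []
    for _ in range(steps):
        observations.append(matvec(C, s))
        mean = matvec(A, s)
        s = [m + rng.gauss(0.0, sigma) for m in mean]
    return observations

rng = random.Random(0)
d = 2  # assumed state and observation dimension
# Randomly sampled A and C; the scales below are illustrative assumptions.
A = [[rng.gauss(0.0, 0.3) for _ in range(d)] for _ in range(d)]
C = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(d)]
trajectory = simulate(A, C, s0=[1.0, 0.0], sigma=0.1, steps=5, rng=rng)
for t, x_t in enumerate(trajectory):
    print(t, x_t)
```

A trajectory like this would play the role of one prompt; drawing fresh A and C for each trajectory gives a family of tasks from the same distribution.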


Transformers as Algorithms: Generalization and Stability in In-context Learning

Li, Ildiz, Papailiopoulos et al. 2023. Preprint.

In-context learning (ICL) is a type of prompting where a transformer model operates on a sequence of (input, output) examples and performs inference on-the-fly. In this work, we formalize in-context learning as an algorithm learning problem where a transformer model implicitly constructs a hypothesis function at inference-time. We first explore the statistical aspects of this abstraction through the lens of multitask learning: We obtain generalization bounds for ICL when the input prompt is (1) a sequence of i.i.d. (input, label) pairs or (2) a trajectory arising from a dynamical system. The crux of our analysis is relating the excess risk to the stability of the algorithm implemented by the transformer. We characterize when transformer/attention architecture provably obeys the stability condition and also provide empirical verification. For generalization on unseen tasks, we identify an inductive bias phenomenon in which the transfer learning risk is governed by the task complexity and the number of MTL tasks in a highly predictable manner. Finally, we provide numerical evaluations that (1) demonstrate transformers can indeed implement near-optimal algorithms on classical regression problems with i.i.d. and dynamic data, (2) provide insights on stability, and (3) verify our theoretical predictions.


Ovarian Cancer Detection with Popular AI Algorithms: A Brief Review

Mercioni, Holban 2024. IFMBE Proceedings.
