TraDE: Transformers for Density Estimation

Fakoor, Rasool; Chaudhari, Pratik; Mueller, Jonas; Smola, Alexander J.

doi:10.48550/arxiv.2004.02441

Cited by 4 publications

(4 citation statements)

References 31 publications

“…Furthermore, these proposed future studies can be expanded from [9], which investigated better neural architecture designs for building flow-based models using self-attention for the estimator. Combined with increasing evidence in other research domains applying similar architecture [17], we expect the self-attention-based estimator to provide more expressive density estimations [7,20], where the attention mechanism could be directly augmented from flow indication embedding. We leave this research direction as future work.…”

Section: Discussion (mentioning)

confidence: 85%

NanoFlow: Scalable Normalizing Flows with Sublinear Parameter Complexity

Lee, Kim, Yoon

2020

Preprint

Normalizing flows (NFs) have become a prominent method for deep generative models that allow for analytic probability density estimation and efficient synthesis. However, a flow-based network is considered to be inefficient in parameter complexity because of the reduced expressiveness of bijective mappings, which renders the models prohibitively expensive in terms of parameters. We present an alternative parameterization scheme, called NanoFlow, which uses a single neural density estimator to model multiple transformation stages. Hence, we propose an efficient parameter decomposition method and the concept of flow indication embedding, which are key missing components that enable density estimation from a single neural network. Experiments performed on audio and image models confirm that our method provides a new parameter-efficient solution for scalable NFs with significantly sublinear parameter complexity.

“…In [24] the authors model conditional density estimators for multivariate data with conditional sum-product networks that combines tree-based structures with deep models. In [7] the authors combine the transformer model with flows for density estimation. The flow models were also applied for future prediction problems in [29].…”

Section: Related Work (mentioning)

confidence: 99%

TreeFlow: Going Beyond Tree-Based Parametric Probabilistic Regression

Wielopolski, Zięba

2023

Frontiers in Artificial Intelligence and Applications

Tree-based ensembles are known for their outstanding performance in classification and regression problems characterized by feature vectors represented by mixed-type variables from various ranges and domains. However, in regression problems, they are primarily designed to provide deterministic responses or to model the uncertainty of the output with a Gaussian or other parametric distribution. In this work, we introduce TreeFlow, a tree-based approach that combines the benefits of tree ensembles with the ability of normalizing flows to model flexible probability distributions. The main idea is to use a tree-based model as a feature extractor and combine it with a conditional variant of a normalizing flow. Consequently, our approach is capable of modeling complex distributions for the regression outputs. We evaluate the proposed method on challenging regression benchmarks with varying volume, feature characteristics, and target dimensionality. We obtain state-of-the-art results on both probabilistic and deterministic metrics on datasets with multi-modal target distributions, and competitive results on unimodal ones, compared to tree-based regression baselines.

“…Despite their success for modeling text, the application of Transformer architectures to tabular data remains limited [16,28,68]. The use of tabular models together with Transformer-like text architectures has also received little attention [33,53].…”

Section: A3 Featurizing Text For Tabular Models (mentioning)

confidence: 99%

Benchmarking Multimodal AutoML for Tabular Data with Text Fields

Shi, Mueller, Erickson et al.

2021

Preprint

Self Cite

We consider the use of automated supervised learning systems for data tables that contain not only numeric/categorical columns, but one or more text fields as well. Here we assemble 18 multimodal data tables that each contain some text fields and stem from a real business application. Our publicly available benchmark enables researchers to comprehensively evaluate their own methods for supervised learning with numeric, categorical, and text features. To ensure that any single modeling strategy which performs well over all 18 datasets will serve as a practical foundation for multimodal text/tabular AutoML, the diverse datasets in our benchmark vary greatly in: sample size, problem types (a mix of classification and regression tasks), number of features (with the number of text columns ranging from 1 to 28 between datasets), as well as how the predictive signal is decomposed between text vs. numeric/categorical features (and predictive interactions thereof). Over this benchmark, we evaluate various straightforward pipelines to model such data, including standard two-stage approaches where NLP is used to featurize the text such that AutoML for tabular data can then be applied. Compared with human data science teams, the fully automated methodology that performed best on our benchmark (stack ensembling a multimodal Transformer with various tree models) also manages to rank 1st place when fit to the raw text/tabular data in two MachineHack prediction competitions and 2nd place (out of 2380 teams) in Kaggle's Mercari Price Suggestion Challenge.
