Tabular Transformers for Modeling Multivariate Time Series

Padhi, Inkit; Schiff, Yair; Melnyk, Igor; Rigotti, Mattia; Mroueh, Youssef; Dognin, Pierre; Ross, Jerret; Nair, Ravi; Altman, Erik R.

doi:10.1109/icassp39728.2021.9414142

Cited by 53 publications

(40 citation statements)

References 9 publications


“…TABERT [48], a more elaborate neural approach inspired by the large language transformer model BERT [9], is trained on semi-structured test data to perform language-specific tasks. Several other studies utilize tabular data, but their problem settings are outside of our scope [3,21,31,32,35].…”

Section: Related Work (mentioning)

confidence: 99%

SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training

Somepalli, Goldblum, Schwarzschild et al. 2021

Preprint


Tabular data underpins numerous high-impact applications of machine learning from fraud detection to genomics and healthcare. Classical approaches to solving tabular problems, such as gradient boosting and random forests, are widely used by practitioners. However, recent deep learning methods have achieved a degree of performance competitive with popular techniques. We devise a hybrid deep learning approach to solving tabular data problems. Our method, SAINT, performs attention over both rows and columns, and it includes an enhanced embedding method. We also study a new contrastive self-supervised pre-training method for use when labels are scarce. SAINT consistently improves performance over previous deep learning methods, and it even outperforms gradient boosting methods, including XGBoost, CatBoost, and LightGBM, on average over a variety of benchmark tasks.


“…Additional embedding techniques and manipulated representations of the data (e.g. TaBERT [26], TabFormer [27]) can be incorporated as pre-processing steps.…”

Section: Performance Comparisons (mentioning)

confidence: 99%

The GatedTabTransformer. An enhanced deep learning architecture for tabular modeling

Cholakov, Kolev 2022

Preprint


There is increasing interest in applying deep learning architectures to tabular data. One of the state-of-the-art solutions is TabTransformer, which incorporates an attention mechanism to better track relationships between categorical features and then uses a standard MLP to output its final logits. In this paper we propose multiple modifications to the original TabTransformer that perform better on binary classification tasks across three separate datasets, with AUROC gains of more than 1%. Inspired by gated MLP, linear projections are implemented in the MLP block and multiple activation functions are tested. We also evaluate the importance of specific hyperparameters during training.


“…These methods can often be made to accommodate categorical data through one-hot encoding, but in the authors' experience quality of models in this family rapidly degrades as the fraction of the series' variables that are categorical increases. The field in which anomaly detection in categorical time series is most developed in is intrusion detection in network security and fraud detection [4,9]. The authors in [9] utilize a transformer architecture for fraud detection inspired by an analogy between finite sequences of discrete variables and words in the domain of Natural Language Processing (NLP).…”

Section: Related Work (mentioning)

confidence: 99%


NLP Based Anomaly Detection for Categorical Time Series

Horak, Chandrasekaran, Giovanni 2022

Preprint


Identifying anomalies in large multi-dimensional time series is a crucial and difficult task across multiple domains. Few methods exist in the literature that address this task when some of the variables are categorical in nature. We formalize an analogy between categorical time series and classical Natural Language Processing and demonstrate the strength of this analogy for anomaly detection and root cause investigation by implementing and testing three different machine learning anomaly detection and root cause investigation models based upon it.
