2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2019
DOI: 10.1109/cvprw.2019.00360
SuperTML: Two-Dimensional Word Embedding for the Precognition on Structured Tabular Data

Abstract: Tabular data is the most commonly used form of data in industry, according to the Kaggle ML and DS Survey. Gradient Boosting Trees, Support Vector Machines, Random Forests, and Logistic Regression are typically used for classification tasks on tabular data. DNN models using categorical embeddings have also been applied to this task, but all attempts thus far have used one-dimensional embeddings. The recent Super Characters method, which uses two-dimensional word embeddings, achieved state-of-the-art results in text clas…

Cited by 44 publications (32 citation statements)
References 24 publications
“…Buturović et al. designed a tabular-data-to-graphical mapping in which each feature vector is treated as a kernel, which is then applied to an arbitrary base image [17]. Sun et al. experimented with pretrained production-level CNN models, taking a diametrically opposite approach: the literal values of the features are projected graphically onto an image; for example, if a feature has a value of 0.2 for a given participant in the sample, the image would include the actual number 0.2 on it [18].…”
Section: Multimodal Codex Sequence
confidence: 99%
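The graphical projection described in the statement above can be sketched as a simple grid layout: each feature value is converted to its string form and assigned the cell where it would be drawn on the image. This is a minimal sketch of the idea, not the authors' code; the function name, square-grid layout, and 224-pixel canvas size are illustrative assumptions.

```python
import math

def supertml_layout(features, img_size=224):
    """Assign each feature value a grid cell on a blank image canvas.

    Sketch of the two-dimensional embedding idea: the string form of
    each value (e.g. "0.2") would then be drawn as text at the
    returned (x, y) position, and the resulting image fed to a CNN.
    """
    n = len(features)
    cols = math.ceil(math.sqrt(n))            # square-ish grid
    rows = math.ceil(n / cols)
    cell_w, cell_h = img_size // cols, img_size // rows
    return [
        (str(v), (i % cols) * cell_w, (i // cols) * cell_h)
        for i, v in enumerate(features)
    ]

# Four features of one sample -> four text placements on a 224x224 canvas
layout = supertml_layout([0.2, 5.1, "setosa", 3])
```

An actual implementation would rasterize each string at its cell with a drawing library (e.g. Pillow's `ImageDraw.text`) before passing the image to a pretrained CNN.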
“…We also utilized one of the most recently developed automated ML (AutoML) [28] algorithms, the AutoGluon [29] Python library, to find the best predictive ML classification models for our dataset. For DL, we employed two DL classification models proposed for tabular data: SuperTML [30] and TabNet [31]. We also provide their backgrounds in Appendix A.2.…”
Section: ML and DL Algorithm Settings
confidence: 99%
“…Although numerous AutoML packages exist, we utilized the latest and best-performing one, the AutoGluon [29] library. • SuperTML: proposed by Sun et al. [30], SuperTML offers a new way to handle classification on tabular data with deep neural networks by embedding each instance's features into a two-dimensional image. It then uses a pretrained convolutional neural network (CNN) [54], consisting of residual networks (ResNet) [2], to extract a representation of the images, after which fully connected layers (with two hidden layers) classify the input.…”
confidence: 99%
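The classification head described above (pretrained CNN features followed by two hidden fully connected layers) can be sketched in NumPy; the layer sizes and weights here are illustrative placeholders, not the quoted model's parameters.

```python
import numpy as np

def fc_head(features, w1, b1, w2, b2, w3, b3):
    """Two-hidden-layer classification head over CNN features.

    `features` stands in for the representation a pretrained ResNet
    would extract from the rendered image; the weights are placeholders.
    """
    h1 = np.maximum(0.0, features @ w1 + b1)   # hidden layer 1 (ReLU)
    h2 = np.maximum(0.0, h1 @ w2 + b2)         # hidden layer 2 (ReLU)
    logits = h2 @ w3 + b3
    e = np.exp(logits - logits.max())          # numerically stable softmax
    return e / e.sum()

# Illustrative shapes: 512-d CNN features, 64-unit hidden layers, 3 classes
rng = np.random.default_rng(0)
probs = fc_head(
    rng.standard_normal(512),
    rng.standard_normal((512, 64)) * 0.05, np.zeros(64),
    rng.standard_normal((64, 64)) * 0.05, np.zeros(64),
    rng.standard_normal((64, 3)) * 0.05, np.zeros(3),
)
```

In the quoted work, the CNN backbone and this head are trained end to end on the rendered images; the sketch only shows the forward pass of the head.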
“…Since a feature's position in a table, unlike a pixel's position in an image, carries no meaning, CNNs are not applicable to tabular data out of the box. Works attempting this have shown underwhelming results: their performance is "no better than SOTA" [3] or than XGBoost [5].…”
Section: Introduction
confidence: 99%