Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.807

Rationales for Sequential Predictions

Abstract: Sequence models are a critical component of modern NLP systems, but their predictions are difficult to explain. We consider model explanations through rationales, subsets of context that can explain individual model predictions. We find sequential rationales by solving a combinatorial optimization: the best rationale is the smallest subset of input tokens that would predict the same output as the full sequence. Enumerating all subsets is intractable, so we propose an efficient greedy algorithm to approximate this objective.
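To make the objective concrete, here is a minimal sketch of the greedy search the abstract describes. `log_prob` and `predict` are hypothetical stand-ins for the underlying sequence model (they score and predict the target given only a subset of the context); the names and interface are assumptions, not the authors' implementation.

```python
def greedy_rationale(context, target, log_prob, predict):
    """Grow the rationale one position at a time until it alone
    yields the same prediction as the full sequence."""
    rationale = set()
    while predict(rationale) != target:
        remaining = [i for i in range(len(context)) if i not in rationale]
        # Add whichever remaining position most increases the probability
        # of the prediction made from the full context.
        best = max(remaining, key=lambda i: log_prob(rationale | {i}, target))
        rationale.add(best)
    return sorted(rationale)
```

Each step scores every remaining candidate, so a single rationale costs at most O(n^2) subset evaluations for a length-n context, versus the 2^n of exhaustive enumeration.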

Cited by 13 publications (26 citation statements). References 23 publications.
“…For this example, the transformer used for CAREER follows the architecture described in Radford et al (2018). We find the rationale using the greedy rationalization method described in Vafa et al (2021). Greedy rationalization requires fine-tuning the model for compatibility; we do this by fine-tuning with "job dropout", where with 50% probability, we drop out a uniformly random amount of observations in the history.…”
Section: E Experimental Details
confidence: 99%
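The "job dropout" scheme quoted above is concrete enough to sketch. The helper below is one plausible reading, not the cited authors' code; in particular, how the dropped subset is sampled is an assumption.

```python
import random

def job_dropout(history, p=0.5):
    """With probability p, drop a uniformly random number of
    observations from the history (an assumed reading of the
    'job dropout' described in the citation above)."""
    if random.random() >= p or len(history) <= 1:
        return list(history)
    n_drop = random.randint(1, len(history) - 1)  # how many to drop
    keep = sorted(random.sample(range(len(history)), len(history) - n_drop))
    return [history[i] for i in keep]
```

Training on such partial histories is what makes the model compatible with greedy rationalization, which must score incomplete subsets of the context.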
“…To understand CAREER's prediction, we show the model's rationale, or the jobs in this individual's history that are sufficient for explaining the model's prediction. (We adapt the greedy rationalization method from Vafa et al (2021); refer to Appendix E for more details.) In this example, CAREER only needs three previous jobs to predict biological technician: animal caretaker, engineering technician, and student.…”
confidence: 99%
“…We focus on extractive rationalization (Lei et al, 2016), which generates a subset of inputs or highlights as "rationales" such that the model can condition predictions on them. Recent development has focused on improving joint training of rationalizer and predictor components (Bastings et al, 2019; Yu et al, 2019; Paranjape et al, 2020; Guerreiro and Martins, 2021; Sha et al, 2021), or on extensions to text matching (Swanson et al, 2020) and sequence generation (Vafa et al, 2021). These rationale models are mainly compared based on predictive performance, as well as agreement with human annotations (DeYoung et al, 2020).…”
Section: Related Work
confidence: 99%
“…Intuitively, our approach trains two classifiers: an explainability classifier (EC), which labels words in the textual context where the relation is expressed as important or not for the relation to be extracted, and a relation classifier (RC), which predicts the relation that holds between two given entities using only the words deemed important. As such, our approach is self-explanatory because of the inter-dependency between RC and EC, and generates faithful explanations that correctly depict how the relation classifier makes a decision (Vafa et al 2021).…”
Section: Introduction
confidence: 99%
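The EC/RC interplay described in this statement can be illustrated with a toy module. Everything below — the embedding encoder, the layer names, and the soft masking — is an illustrative assumption, not the cited system:

```python
import torch
import torch.nn as nn

class RationaleRelationModel(nn.Module):
    """Toy sketch of the two-classifier setup described above."""

    def __init__(self, vocab_size, hidden=128, num_relations=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.ec = nn.Linear(hidden, 1)              # explainability classifier: important or not
        self.rc = nn.Linear(hidden, num_relations)  # relation classifier

    def forward(self, token_ids):
        h = self.embed(token_ids)           # (seq_len, hidden)
        keep = torch.sigmoid(self.ec(h))    # (seq_len, 1) importance scores
        # The relation classifier conditions only on the words the EC
        # deems important; soft masking keeps the pipeline differentiable.
        pooled = (keep * h).sum(dim=0) / keep.sum().clamp(min=1e-6)
        return self.rc(pooled), keep.squeeze(-1)
```

A faithful deployment would threshold `keep` so the RC literally sees only the selected words, which is what lets the explanation depict the classifier's actual decision process.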
“…In this situation, we measure the overlap between the words identified by the EC as important and the words used by rules using standard precision, recall, and F1 scores. The second strategy relies on plausibility, i.e., can the machine explanations be understood and interpreted by humans (Wiegreffe and Pinter 2019a; Vafa et al 2021)? To this end, we compare the tokens identified by the EC against human annotations of the context words marked as important for the relation.…”
Section: Introduction
confidence: 99%
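The token-overlap evaluation described here reduces to set arithmetic. A minimal sketch, assuming both sides are given as token sets:

```python
def overlap_prf1(predicted, gold):
    """Precision/recall/F1 between tokens the EC marks important and a
    reference set (rule-matched words or human annotations)."""
    pred, ref = set(predicted), set(gold)
    tp = len(pred & ref)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: EC marks {"acquired", "by"}, humans marked only {"acquired"}.
print(overlap_prf1({"acquired", "by"}, {"acquired"}))  # (0.5, 1.0, 0.666...)
```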