Proceedings of the Fourteenth Workshop on Semantic Evaluation 2020
DOI: 10.18653/v1/2020.semeval-1.55
CNRL at SemEval-2020 Task 5: Modelling Causal Reasoning in Language with Multi-Head Self-Attention Weights Based Counterfactual Detection

Abstract: In this paper, we describe an approach for modelling causal reasoning in natural language by detecting counterfactuals in text using multi-head self-attention weights. We use pre-trained transformer models to extract contextual embeddings and self-attention weights from the text. We show the use of convolutional layers to extract task-specific features from these self-attention weights. Further, we describe a fine-tuning approach with a common base model for knowledge sharing between the two closely related subtasks.
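The abstract describes extracting multi-head self-attention weights from a pre-trained transformer and running convolutional layers over them to obtain task-specific features. The following is a minimal sketch of that idea, assuming a HuggingFace bert-base-uncased encoder; the layer choice, kernel size, and channel counts are illustrative assumptions, not the authors' reported configuration.

```python
# Minimal sketch (not the authors' exact architecture): extract multi-head
# self-attention weights from a pre-trained transformer and run a small
# convolutional feature extractor over them.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

text = "Had the weather been better, the match would not have been cancelled."
inputs = tokenizer(text, return_tensors="pt", padding="max_length",
                   truncation=True, max_length=64)

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer,
# each of shape (batch, num_heads, seq_len, seq_len).
attn = outputs.attentions[-1]            # (1, 12, 64, 64) for bert-base

# Treat the 12 attention heads as input channels and extract task-specific
# features with a 2D convolution (kernel size is an illustrative choice).
conv = nn.Sequential(
    nn.Conv2d(in_channels=12, out_channels=32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
)
features = conv(attn).flatten(1)         # (1, 32) task-specific feature vector
print(features.shape)
```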

Cited by 3 publications (2 citation statements) | References 11 publications
“…The following is a summary of the papers submitted for SemEval-2020 Task 5: Ding et al., Fajcik et al., and Patil and Baths used Transformer models such as BERT for both subtasks. The output tensor of these transformers is passed to a linear layer, and the classification is performed with either a sigmoid or a logistic regression model. This approach gives an F1-score of 80% and 70% for the respective subtasks (Ding et al., 2020; Fajcik et al., 2020; Patil and Baths, 2020). Sung et al. (2020) use a multi-stacked LSTM to achieve a performance of 80% in both subtasks.…”
Section: Related Work
confidence: 99%
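As a rough illustration of the pipeline described in this statement (transformer output tensor, then a linear layer, then a sigmoid for the binary label), here is a minimal sketch; the model name and head layout are assumptions, not any particular team's code.

```python
# Minimal sketch of the classification head described above (assumed layout):
# the transformer's pooled output tensor is fed to a linear layer and a
# sigmoid produces the binary counterfactual label.
import torch
import torch.nn as nn
from transformers import AutoModel

class CounterfactualClassifier(nn.Module):
    def __init__(self, model_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.classifier = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # Use the pooled [CLS] representation as the sentence embedding.
        logits = self.classifier(out.pooler_output)
        return torch.sigmoid(logits)      # probability of "counterfactual"
```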
“…A unique approach for Subtask-2 is used by the 7th-placed team, rajaswa patil, which uses a common base architecture for both subtasks. They first train with a binary-classification module for Subtask-1, then replace it with a regression module and further fine-tune the system for Subtask-2 (Patil and Baths, 2020), leveraging the commonality between the two tasks.…”
Section: Subtask-2: Detecting Antecedent and Consequent (DAC)
confidence: 99%
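A minimal sketch of the two-stage fine-tuning described in this statement: a binary-classification module is trained for Subtask-1, then replaced by a regression module while the shared base is further fine-tuned for Subtask-2. The encoder choice and the regression output layout are assumptions, not the authors' exact setup.

```python
# Minimal sketch (assumed structure) of the shared-base fine-tuning:
# Stage 1 trains a classification head for Subtask-1; Stage 2 swaps it for a
# regression head and keeps fine-tuning the same base for Subtask-2.
import torch.nn as nn
from transformers import AutoModel

class SharedBase(nn.Module):
    """Common encoder whose weights are reused across both subtasks."""
    def __init__(self, model_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        return out.pooler_output          # (batch, hidden_size)

base = SharedBase()
hidden = base.encoder.config.hidden_size

# Stage 1 (Subtask-1): binary-classification module on top of the base.
clf_head = nn.Sequential(nn.Linear(hidden, 1), nn.Sigmoid())
# ... train base + clf_head on counterfactual detection ...

# Stage 2 (Subtask-2): discard the classification module, attach a regression
# module and continue fine-tuning the already-trained base. The 4-dimensional
# output (antecedent/consequent start and end positions) is an assumption.
reg_head = nn.Linear(hidden, 4)
# ... continue training base + reg_head on antecedent/consequent detection ...
```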