Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
DOI: 10.18653/v1/2022.naacl-main.181

Paragraph-based Transformer Pre-training for Multi-Sentence Inference

Abstract: Inference tasks such as answer sentence selection (AS2) or fact verification are typically solved by fine-tuning transformer-based models as individual sentence-pair classifiers. Recent studies show that these tasks benefit from modeling dependencies across multiple candidate sentences jointly. In this paper, we first show that popular pre-trained transformers perform poorly when fine-tuned on multi-candidate inference tasks. We then propose a new pre-training objective that models the paragraph-level…
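To make the baseline the abstract contrasts against concrete, here is a minimal sketch of AS2 with a pre-trained transformer used as an independent (question, candidate) sentence-pair classifier. The checkpoint name, example data, and untuned classification head are illustrative assumptions; the paper's paragraph-level pre-training and joint multi-candidate model are not reproduced here.

```python
# Hedged sketch of the standard AS2 setup: each (question, candidate) pair
# is scored independently by a sentence-pair cross-encoder.
# "roberta-base" is an assumed stand-in checkpoint, not the paper's model.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "roberta-base"  # assumption: any pre-trained encoder works for the sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)
model.eval()

question = "When did the first crewed moon landing happen?"
candidates = [
    "Apollo 11 landed the first humans on the Moon in July 1969.",
    "The Moon orbits the Earth roughly every 27 days.",
    "Neil Armstrong was born in Ohio in 1930.",
]

# Encode every candidate with the question as one sentence pair per batch row.
inputs = tokenizer([question] * len(candidates), candidates,
                   padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    # Probability that the candidate answers the question (class index 1).
    scores = model(**inputs).logits.softmax(dim=-1)[:, 1]
print(candidates[int(scores.argmax())])  # ranking is arbitrary until fine-tuned
```

In the paper's joint formulation, all candidates for a question are instead encoded together so the model can compare them against each other; the sketch above only shows the independent sentence-pair setup that the abstract argues is weaker for multi-candidate inference.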

Cited by 2 publications (2 citation statements, both classified as “mentioning”; published in 2023 and 2024). References 25 publications.
“…Transformers can discern importance through a self-attention mechanism and achieve superior performance by gaining a deeper understanding of the context. Furthermore, these models have facilitated agent-based modeling, aiding in the automation of problem-solving in the field of materials. These architectures require a pretraining process during which they learn general characteristics from extensive data sets. When fine-tuned for a specific task, they are known to perform better, even with limited data. They have been especially adapted for predicting properties of materials, including polymer informatics, using Simplified Molecular Input Line Entry System (SMILES) representations.…”

Section: Introduction (citation type: mentioning)
Confidence: 99%
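As a concrete reading of the quoted pretrain-then-fine-tune workflow, here is a minimal, hypothetical sketch of property prediction from a SMILES string. The checkpoint name is an assumption (a publicly available chemistry-pretrained encoder), and the regression head is untrained; the citing paper's actual model is not reproduced.

```python
# Hedged sketch: a SMILES-pretrained encoder repurposed as a single-value
# property regressor, as in the fine-tuning setup the quoted passage describes.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "seyonec/ChemBERTa-zinc-base-v1"  # assumed public SMILES-pretrained encoder
tokenizer = AutoTokenizer.from_pretrained(MODEL)
# num_labels=1 with problem_type="regression" gives a scalar-output head.
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL, num_labels=1, problem_type="regression")
model.eval()

smiles = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin, in SMILES notation
inputs = tokenizer(smiles, return_tensors="pt")
with torch.no_grad():
    prediction = model(**inputs).logits.squeeze()
print(float(prediction))  # arbitrary until the head is fine-tuned on labeled data
```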
“…Previous works (Zhong et al., 2019; Liello et al., 2022; Liu et al., 2020) and systems like FACTGPT typically formulate fact verification as a classification task where the input consists of the evidence sentence(s) and the claim, and the output is a label indicating the veracity of the entire claim as SUPPORTED, REFUTED, or IRRELEVANT. As a concrete example, if the claim is “United States is in North America and has 51 states”, then a sentence-level classification task would classify this claim as incorrect, since there are 50 states in the United States.…”

Section: Introduction (citation type: mentioning)
Confidence: 99%
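A minimal sketch of the sentence-level formulation the quote describes: the model receives (evidence, claim) as one input pair and predicts a single veracity label. The checkpoint, label order, and example evidence are illustrative assumptions, not the cited systems' implementations.

```python
# Hedged sketch of sentence-level fact verification as 3-way pair classification.
# "bert-base-uncased" is an assumed stand-in; a real system would be fine-tuned
# on claim/evidence pairs before its predictions mean anything.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "bert-base-uncased"  # assumption: any encoder fine-tunable for 3 labels
LABELS = ["SUPPORTED", "REFUTED", "IRRELEVANT"]  # label set from the quoted paper

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=3)
model.eval()

evidence = "The United States is a country of 50 states covering North America."
claim = "United States is in North America and has 51 states."

# Encode evidence and claim together as one sentence pair (cross-encoder input).
inputs = tokenizer(evidence, claim, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(LABELS[int(logits.argmax(dim=-1))])  # arbitrary with untuned weights
```

Note that this whole-claim label is exactly what the example in the quote illustrates: a claim that is partly right (North America) and partly wrong (51 states) still receives a single REFUTED-style verdict.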