2022
DOI: 10.1162/tacl_a_00482

Is My Model Using the Right Evidence? Systematic Probes for Examining Evidence-Based Tabular Reasoning

Abstract: Neural models command state-of-the-art performance across NLP tasks, including ones involving “reasoning”. Models claiming to reason about the evidence presented to them should attend to the correct parts of the input while avoiding spurious patterns therein, be self-consistent in their predictions across inputs, and be immune to biases derived from their pre-training in a nuanced, context-sensitive fashion. Do the prevalent *BERT-family of models do so? In this paper, we study this question using the proble…

Cited by 3 publications (4 citation statements)
References 54 publications

“…One potential reason is memorization. Previous works (Petroni et al., 2019; Carlini et al., 2021; Ishihara, 2023) show that large pretrained language models store knowledge in their parameters, which they tend to retrieve instead of reasoning over the provided input (Gupta et al., 2022a). Hence, models can memorize common arithmetic operations they encounter during training and perform well on certain downstream tasks.…”
Section: Results
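
The memorization effect this excerpt describes can be illustrated with a LAMA-style fill-mask probe in the spirit of Petroni et al. (2019). The following is a minimal sketch, assuming the Hugging Face transformers library; the model choice and prompts are illustrative assumptions, not taken from any of the cited papers.

    # Minimal fill-mask probe: a masked LM often answers from parametric
    # memory rather than from the evidence it is given.
    from transformers import pipeline

    fill = pipeline("fill-mask", model="bert-base-uncased")

    # No context: the answer can only come from pre-training memorization.
    for pred in fill("The capital of France is [MASK].", top_k=3):
        print(pred["token_str"], round(pred["score"], 3))

    # Contradicting context: a model that reasons over the provided input
    # should prefer "lyon"; a persistent preference for "paris" suggests
    # retrieval from memorized parameters instead.
    context = ("According to the table, the capital of France is Lyon. "
               "The capital of France is [MASK].")
    for pred in fill(context, top_k=3):
        print(pred["token_str"], round(pred["score"], 3))
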
“…question-answering) to NLI (Jena et al., 2022). Unlike unstructured text data, tables have a natural structure that allows creating controlled experiments more easily (Gupta et al., 2022a). We drew inspiration from prior tabular probing approaches and extended them for automating probing of numerical tabular data.…”
Section: Related Work
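
The controlled-experiment property mentioned in this excerpt can be made concrete: because a table is structured, exactly one cell can be perturbed while everything else is held fixed, and the model's decisions on the two inputs compared. A minimal sketch, assuming an off-the-shelf NLI model (roberta-large-mnli, used here purely for illustration) and a linearization format of our own choosing:

    # Controlled tabular probe: change a single cell and check whether the
    # entailment decision changes accordingly. The linearization and model
    # are illustrative assumptions, not the cited papers' setup.
    from transformers import pipeline

    nli = pipeline("text-classification", model="roberta-large-mnli")

    def linearize(rows):
        # Flatten the table into "column : value" pairs.
        return " ; ".join(f"{k} : {v}" for row in rows for k, v in row.items())

    original = [{"Player": "Ann", "Points": "31"}]
    perturbed = [{"Player": "Ann", "Points": "12"}]  # only one cell differs
    hypothesis = "Ann scored more than 20 points."

    for table in (original, perturbed):
        result = nli({"text": linearize(table), "text_pair": hypothesis})
        print(linearize(table), "->", result)

A self-consistent model should move away from entailment on the perturbed table, since the single changed cell is exactly the evidence the hypothesis depends on.
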
“…For example, for the edge going from Node 𝑒1.Name to Node 𝑒2.Name, we add "c1.Name <> c2.Name AND" - pending ANDs are removed at the end (line 17). The resulting query is obtained by concatenating the three clauses and returning it (lines 17-18). Once a query is derived, its execution on 𝑐 gives a result table like the one in Table 2.…”
Section: Data Evidence Generation
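
To make the clause construction above concrete, here is a minimal sketch of the general pattern: each graph edge contributes one inequality predicate to the WHERE clause, the pending AND is stripped at the end, and the three clauses are concatenated. The Edge and build_query names are hypothetical; only the alias scheme (c1, c2, ...) mirrors the quoted example.

    # Hypothetical sketch of edge-to-predicate query construction; not the
    # cited paper's actual code. Requires Python 3.9+ for str.removesuffix.
    from dataclasses import dataclass

    @dataclass
    class Edge:
        src: int        # index of the first node (e.g., e1)
        dst: int        # index of the second node (e.g., e2)
        attribute: str  # compared attribute (e.g., "Name")

    def build_query(table, edges, num_nodes):
        # One alias per node over the same base table: c1, c2, ...
        aliases = [f"c{i + 1}" for i in range(num_nodes)]
        select_clause = "SELECT " + ", ".join(f"{a}.*" for a in aliases)
        from_clause = "FROM " + ", ".join(f"{table} {a}" for a in aliases)

        # Each edge contributes one inequality predicate followed by "AND".
        where_clause = "WHERE "
        for e in edges:
            where_clause += (f"{aliases[e.src]}.{e.attribute} <> "
                             f"{aliases[e.dst]}.{e.attribute} AND ")
        # Remove the pending AND left after the last predicate.
        where_clause = where_clause.removesuffix("AND ").rstrip()

        # The resulting query concatenates the three clauses.
        return f"{select_clause} {from_clause} {where_clause}"

    print(build_query("Customers", [Edge(0, 1, "Name")], num_nodes=2))
    # SELECT c1.*, c2.* FROM Customers c1, Customers c2 WHERE c1.Name <> c2.Name
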
“…In terms of reasoning requirements, about 80% of the examples in ToTTo [40] have sentences describing the data with text that does not contain mathematical expressions, such as max, min, and count, or comparisons across values. (iii) They contain bias and errors that may lead to incorrect learning in the target models [18].…”
Section: Introduction