2023
DOI: 10.1109/taslp.2023.3293046

LogiQA 2.0—An Improved Dataset for Logical Reasoning in Natural Language Understanding

Cited by 9 publications (10 citation statements)
References 66 publications
“…One plausible explanation for this pattern could be that the summaries were also generated by the same model. As previously demonstrated, GPT-based models tend to favor their own generated text more than text generated by other models (Chiang and Liu et al, 2023a). Conversely, these models exhibit a high degree of sensitivity to the provided prompt and input data.…”
Section: Comparing Human and GPT-4 as Evaluators
Type: mentioning
confidence: 87%
“…This pattern indicated a low level of agreement with human annotators. Therefore, we could not reproduce the same results as Liu et al (2023a).…”
Section: Comparing Human and GPT-4 as Evaluators
Type: mentioning
confidence: 92%
“…Logical Reasoning For logical reasoning data resources, they can be mainly divided into two categories: Natural Language Inference (Saha et al, 2020; Tian et al, 2021; Liu et al, 2021) and Multiple-Choice Reading Comprehension (Liu et al, 2020; Liu et al, 2023a). So far, there have been studies that have conducted in-depth analyses of the performance of LLMs on these two types of tasks.…”
Section: Reasoning Evaluation of LLMs
Type: mentioning
confidence: 99%