Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.364

SPARTQA: A Textual Question Answering Benchmark for Spatial Reasoning

Abstract: This paper proposes a question-answering (QA) benchmark for spatial reasoning on natural language text which contains more realistic spatial phenomena not covered by prior work and is challenging for state-of-the-art language models (LMs). We propose a distant supervision method to improve on this task. Specifically, we design grammar and reasoning rules to automatically generate a spatial description of visual scenes and corresponding QA pairs. Experiments show that further pretraining LMs on these automatically…
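To make the distant-supervision recipe in the abstract concrete, here is a minimal sketch (not the authors' actual grammar or code) of how templated sentences plus a hand-written reasoning rule could turn a toy scene into a story and a QA pair; the scene, templates, and transitivity rule below are all illustrative assumptions:

```python
# A toy scene: object name -> (x, y) grid coordinates (illustrative only).
scene = {"circle": (0, 0), "square": (1, 0), "triangle": (2, 0)}

def relation(a, b):
    """Read a left/right relation off the toy coordinates."""
    return "to the left of" if scene[a][0] < scene[b][0] else "to the right of"

def describe(pairs):
    """Grammar rule: one templated sentence per stated object pair."""
    return " ".join(f"The {a} is {relation(a, b)} the {b}." for a, b in pairs)

# State only the adjacent pairs, so answering about the held-out pair
# requires reasoning rather than string matching.
story = describe([("circle", "square"), ("square", "triangle")])

# Reasoning rule (transitivity of "left of") supplies the gold answer.
question = "Is the circle to the left of the triangle?"
answer = "Yes" if scene["circle"][0] < scene["triangle"][0] else "No"

print(story)
print(question, answer)
```

Because the description and the gold answer are both derived from the same underlying scene, QA pairs of this kind can be generated at scale without human annotation, which is the point of the distant-supervision setup.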

Cited by 27 publications (32 citation statements)
References 65 publications
“…Regarding the evaluation of the approach (Step 6), two observations can be made at the time of this publication: preliminary results show that 1) pre-training language models on specific spatial knowledge does not alter the general commonsense knowledge gained from pre-training on CSKG; 2) a slight improvement (+2%) is observed only for the SIQA dataset, which indicates that the knowledge extracted from Visual Genome is substantially misaligned with the spatial domain partition of the other benchmarks (a phenomenon also reported in [17,56]). Due to the early stage of this research work on vertical augmentation, a thorough error analysis cannot be reported here.…”
Section: Vertical Augmentation: Towards Exploring the Depth of Common... (supporting)
confidence: 56%
“…Furthermore, we hypothesize that, by modifying Visual Genome through additional annotations based on formally-characterized spatial relations, reasoning capabilities can be advanced further. In particular, as demonstrated by [56], it is possible to obtain high-quality textual annotations of topological relations like connected, disconnected, overlap, in, and touch, and to boost the performance of models on datasets like bAbI [15] (see section 1.2).…”
Section: Vertical Augmentation: Towards Exploring the Depth of Common... (mentioning)
confidence: 99%
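The topological relations named in the quote above are RCC-style relations, and they can be computed mechanically from region geometry. A minimal sketch, assuming Visual Genome-style axis-aligned bounding boxes (x1, y1, x2, y2) rather than the annotation pipeline of [56]:

```python
# Classify a pair of axis-aligned boxes into the topological relations
# named in the quote. "connected" is simply any non-disconnected pair.
def topological_relation(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # No shared points at all -> disconnected
    if ax2 < bx1 or bx2 < ax1 or ay2 < by1 or by2 < ay1:
        return "disconnected"
    # Boundaries meet but interiors do not intersect -> touch
    if ax2 == bx1 or bx2 == ax1 or ay2 == by1 or by2 == ay1:
        return "touch"
    # a lies entirely inside b -> in
    if ax1 >= bx1 and ay1 >= by1 and ax2 <= bx2 and ay2 <= by2:
        return "in"
    return "overlap"

print(topological_relation((0, 0, 2, 2), (3, 3, 5, 5)))  # disconnected
print(topological_relation((0, 0, 2, 2), (2, 0, 4, 2)))  # touch
print(topological_relation((1, 1, 2, 2), (0, 0, 5, 5)))  # in
print(topological_relation((0, 0, 3, 3), (2, 2, 5, 5)))  # overlap
```

Relations computed this way can then be verbalized into textual annotations ("the lamp touches the table") of the kind the quote describes.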
“…"all the necessary information in the text") or commonsense reasoning (i.e. "text needs to be combined with extra world knowledge") settings: 18 • spatial reasoning: bAbI [272], SpartQA [176], many VQA datasets [e.g. 117, see §3.…”
Section: 2.3 (mentioning)
confidence: 99%
“…We have also included more showcases to solve sentiment analysis (Go et al., 2009) and email spam detection in our GitHub repository. We will add models for procedural reasoning and spatial role labeling (Mirzaee et al., 2021) in the future.…”
Section: Inference-only Example (mentioning)
confidence: 99%
“…In general, the integration of domain knowledge can be done 1) using pretrained models and transferring knowledge (Devlin et al., 2018; Mirzaee et al., 2021), 2) designing architectures that integrate knowledge expressed in knowledge bases (KB) and knowledge graphs (KG) in a way that the KB/KG context influences the learned representations (Yang and Mitchell, 2019; Sun et al., 2018), or 3) using the knowledge explicitly and logically as a set of constraints or preferences over the inputs or outputs (Li and Srikumar, 2019a; Nandwani et al., 2019b; Muralidhar et al., 2018; Stewart and Ermon, 2017). Our current library aims at facilitating the third approach.…”
Section: Introduction (mentioning)
confidence: 99%
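As an illustration of the third approach the quote describes (not the cited library's actual API), here is a minimal sketch in which the spatial rule left(a,b) AND left(b,c) implies left(a,c) is relaxed into a differentiable penalty over model outputs; the probabilities, the product t-norm relaxation, and the constraint weight are all illustrative assumptions:

```python
import torch

# Hypothetical model probabilities for three "left-of" predictions.
p_ab = torch.tensor(0.9, requires_grad=True)   # left(a, b)
p_bc = torch.tensor(0.8, requires_grad=True)   # left(b, c)
p_ac = torch.tensor(0.2, requires_grad=True)   # left(a, c)

# Product t-norm relaxation of the implication: the body's truth degree
# must not exceed the head's; any excess is the constraint violation.
body = p_ab * p_bc
violation = torch.clamp(body - p_ac, min=0.0)

task_loss = torch.tensor(0.0)          # stand-in for the usual QA loss
loss = task_loss + 1.0 * violation     # constraint weight is a hyperparameter
loss.backward()

# Gradient descent on this loss pushes p_ac upward toward consistency.
print(float(violation), float(p_ac.grad))
```

Treating the rule as a soft penalty, rather than baking it into the architecture or the pretraining corpus, keeps the knowledge explicit and lets the same model trade it off against the task loss.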