2020
DOI: 10.1007/s10579-020-09517-1

AI2D-RST: a multimodal corpus of 1000 primary school science diagrams

Abstract: This article introduces AI2D-RST, a multimodal corpus of 1000 English-language diagrams that represent topics in primary school natural sciences, such as food webs, life cycles, moon phases and human physiology. The corpus is based on the Allen Institute for Artificial Intelligence Diagrams (AI2D) dataset, a collection of diagrams with crowdsourced descriptions, which was originally developed to support research on automatic diagram understanding and visual question answering. Building on the segmentation of d…

Cited by 20 publications (19 citation statements)
References 50 publications
“…This imbalance naturally sets limitations to what kinds of research questions may be pursued using the corpus (cf. also Hiippala et al, 2020). Adopting techniques proposed in digital humanities, such as rapid probing (Kuhn, 2019), in combination with guidance from multimodality theory, may eventually help to rethink the role and nature of multimodal corpora.…”
Section: Discussion
confidence: 87%
“…While multimodality theory can provide appropriate metadata schemas for distant viewing, applying these schemas to data is time-consuming manual work, which is precisely the same logjam that has so far prevented building large multimodal corpora with multiple layers of annotation. In Hiippala et al (2020), we have recently shown that annotations created by crowd-sourced non-expert workers can be used to increase the size of multimodal corpora. However, the extent to which these annotations can support research on multimodality depends on how well the crowd-sourced annotations capture the characteristics of the modes and media under analysis.…”
Section: Discussion
confidence: 99%
“…Otto et al (2019) present an annotated dataset of text and imagery that compares the information load in text and images. However, we build on works that study information-level inferences between discourse units in different modalities such as comic book panels (McCloud, 1993), movie plots (Cumming et al, 2017), and diagrammatic elements (Hiippala et al, 2021). In particular, we use Alikhani et al (2020)'s relations that characterize inferences between text and images.…”
Section: Related Work
confidence: 99%
“…In this section, we introduce two interrelated diagram corpora, AI2D [17] and AI2D-RST [14], which build on one another, AI2D-RST covering a subset of AI2D.…”
Section: Multimodal Diagram Corpora
confidence: 99%