2020
DOI: 10.1609/aaai.v34i05.6303

Knowledge-Enriched Visual Storytelling

Abstract: Stories are diverse and highly personalized, resulting in a large possible output space for story generation. Existing end-to-end approaches produce monotonous stories because they are limited to the vocabulary and knowledge in a single training dataset. This paper introduces KG-Story, a three-stage framework that allows the story generation model to take advantage of external Knowledge Graphs to produce interesting stories. KG-Story distills a set of representative words from the input prompts, enriches the w…

Cited by 31 publications (40 citation statements)
References 14 publications
“…Leveraging External Resources for VIST Another set of work leverages external resources and knowledge to enrich the generated visual stories. For example, apply Concept-Net (Liu and Singh, 2004) and self-attention for create commonsense-augmented image features; Wang et al (2020) use graph convolution networks on scene graphs (Johnson et al, 2018) to associate objects across images; and KG-Story (Hsu et al, 2020) is a three-stage VIST framework that uses Visual Genome (Krishna et al, 2017) to produce knowledge-enriched visual stories.…”
Section: Related Work (mentioning)
confidence: 99%
“…Terms These are story-like nouns such as events, time, and locations, which current object detection models are unable to extract. Therefore, we further use a Transformer-GRU (Hsu et al, 2020) to predict story-like terms. For each image and story pair, we use image objects as the input and the nouns in the corresponding human-written story as the ground truth.…”
Section: Story Element Extraction (mentioning)
confidence: 99%
“…So we adapt the few typical works to fit the few-shot setting for comparison. It is noted that, though [10,12] get higher score under the standard setting, we do not compare with them since they have used extra resources such as "Pretrained BERT" and "Knowledge Graph". The descriptions for these models are as follows:…”
Section: Comparison With SOTA (mentioning)
confidence: 99%
“…Generating vocabulary from user's contextual data through Natural Language Generation (NLG) techniques seems an obvious venue to facilitate social interactions. Although NLG has been successfully applied in the context of task-oriented dialogs (He et al, 2017), question answering (Su et al, 2016), text summarization (See et al, 2017), and story generation from photograph sequences (Hsu et al, 2020), it is unclear how these techniques can be adapted to the specific needs of AAC support (Tintarev et al, 2014).…”
Section: Introduction (mentioning)
confidence: 99%