2019
DOI: 10.1609/aaai.v33i01.33018909

Hierarchical Photo-Scene Encoder for Album Storytelling

Abstract: In this paper, we propose a novel model with a hierarchical photo-scene encoder and a reconstructor for the task of album storytelling. The photo-scene encoder contains two subencoders, namely the photo and scene encoders, which are stacked together and behave hierarchically to fully exploit the structure information of the photos within an album. Specifically, the photo encoder generates semantic representation for each photo while exploiting temporal relationships among them. The scene encoder, relying on th…

Cited by 29 publications (18 citation statements)
References: 27 publications
“…Yu et al (2017) and Wang et al (2019) additionally tackle the problem of selecting photos from an ordered stream. Yu et al (2017) compute a photo's attention and pick photos with the highest probability of inclusion, while Wang et al (2019) introduce a scene encoder which can determine when the current photo describes the start of a new scene. Apart from this, Wang et al (2019) is an example of an end-to-end trainable encoder-decoder approach, using GRUs.…”
Section: Direct Deep Learning Without Intermediates
citation type: mentioning
confidence: 99%
“…2.3.1 which have no intermediate data like topic words, or scene graph. We have some further similarity with Wang et al (2019) because it uses attention on the photos. However, none of the above works separately models the text semantics at sentence-level also and word-level; this is something we are introducing to improve the coherence of the output story.…”
Section: Reinforcement Learning
citation type: mentioning
confidence: 99%
“…HPSR (Wang et al 2019): HPSR is a model includes the hierarchical photo-scene encoder, decoder, and reconstructor.…”
Section: Models For Comparison
citation type: mentioning
confidence: 99%
“…Due to the bias can be brought by the hand-coded evaluation metrics, Wang et al (2018b) proposes an adversarial reward learning framework to uncover a reward function from human demonstrations. Wang et al (2019) propose a model with a hierarchical photo-scene encoder and a re-constructor. Huang et al (2019) develops a hierarchically reinforcement learning approach, which introduces a local semantic concept to model.…”
Section: Related Work
citation type: mentioning
confidence: 99%
“…HPSR (Wang et al, 2019a): It introduces an additional RNN stacked on the RNN-based photo encoder to detect the scene change. Information from both RNNs are fed into an RNN for story generation.…”
Section: Models For Comparison
citation type: mentioning
confidence: 99%