“…With the popularity of social networks, a tremendous number of users routinely share a series of photos, along with their related comments/stories on social media platforms such as Instagram and Flickr. Consequently, a new task of visual storytelling [1,24,27,34,37,43,62,64,72], which aims at automatically generating a narrative story for an image stream (as shown in Figure 1), has recently attracted increasing attention in the multimedia community. Given a stream of images, humans are capable of composing a suitable story line and then generating a sequence of sentences.…”