Proceedings of the 28th ACM International Conference on Multimedia 2020
DOI: 10.1145/3394171.3413880

Poet: Product-oriented Video Captioner for E-commerce

Abstract: In e-commerce, a growing number of user-generated videos are used for product promotion. How to generate video descriptions that narrate the user-preferred product characteristics depicted in the video is vital for successful promoting. Traditional video captioning methods, which focus on routinely describing what exists and happens in a video, are not amenable for product-oriented video captioning. To address this problem, we propose a product-oriented video captioner framework, abbreviated as Poet. Poet firs…

Cited by 27 publications (11 citation statements)
References 37 publications
“…Therefore, our study is valuable to video producers and instructors towards high-quality production and organization of educational video content. Additionally, the importance of sensemaking in video goes beyond learning and education, as videos have made significant inroads into online domains such as digital marketing [3], e-commerce [41]. The findings of this study can be beneficial to businesses and organizations interested in the sensemaking of videos on social media.…”
Section: Implications For Practice
Citation type: mentioning
confidence: 94%
“…Sensemaking has diverse theoretical routes and has been explored in a wide variety of domains and disciplines. Four perspectives on sensemaking have become very influential: cost structure of sensemaking [30], the data/frame theory [17], individual sensemaking [6], and collective sensemaking [41].…”
Section: Sensemaking Theories
Citation type: mentioning
confidence: 99%
“…BFVD & FFVD, Large scale product-oriented video captioning datasets proposed by POET [61], for video captioning in the field of e-commerce. The videos having page views over 1,00,000 and a click-through rate of more than 5% are collected from mobile Taobao (a Chinese shopping website) and labeled as either buyer or fan-generated.…”
Section: P Bfvd and Ffvd (Buyer-generated Fashion Video Dataset And Fan-generated Fashion Video Dataset)
Citation type: mentioning
confidence: 99%
“…A considerable number of unique words in the datasets make it among the largest datasets for the task. Figure 6 represents the POET's [61] sample generation on FFVD test dataset in comparison to AA-Transformer [61] and AA-Recnet [61] models.…”
Section: P Bfvd and Ffvd (Buyer-generated Fashion Video Dataset And Fan-generated Fashion Video Dataset)
Citation type: mentioning
confidence: 99%
“…As such, most existing approaches for (single) image captioning [9,14,16,22,25,32,53,56,66,75] cannot be directly adopted to handle this task since they neglect the sequential dependencies over the image stream when composing a story with multiple isolated sentences rather than a holistic story line. In addition, the inter-image temporal gap and visual change in an image stream are often far greater than the inter-frame variations in a video, making video captioning methods [4,13,21,29,41,49,50,54,73,76,77,78] perform unsatisfactorily for visual storytelling either.…”
Section: Introduction
Citation type: mentioning
confidence: 99%