2016 | Preprint
DOI: 10.48550/arxiv.1604.02748

TGIF: A New Dataset and Benchmark on Animated GIF Description

Citing publications: 2017–2021

Cited by 11 publications (2 citation statements)
References 36 publications

“…For image-language pre-training, there exist numerous large-scale and high-quality datasets with over 100 thousand unique images, including English datasets [15,24] and Chinese datasets [10,23] (at the 10-million level). However, there are only three datasets with over 100 thousand unique videos for video-language pre-training, i.e., TGIF [22], AutoGIF [29], and HowTo100M [28]. TGIF, AutoGIF, and HowTo100M are all English datasets crawled from open websites.…”
Section: Related Work
confidence: 99%
“…To empower the pre-trained model with Chinese knowledge and evaluate the effectiveness of the pre-training mechanism on Chinese data, we build a video-language pre-training dataset, namely Alivol-10M, from one of the world's largest e-commerce platforms. Existing video-language datasets are usually collected from open websites [6,22,28,43], where the data are freely created by …”
[Table 2 of the citing work reports results of cross-modal video retrieval tasks; R@N denotes the recall of top-N predictions.]
Section: Pre-training Dataset
confidence: 99%