2013 11th International Workshop on Content-Based Multimedia Indexing (CBMI)
DOI: 10.1109/cbmi.2013.6576545
An in-depth evaluation of multimodal video genre categorization

Abstract: In this paper we propose an in-depth evaluation of the performance of video descriptors for multimodal video genre categorization. We discuss the perspective of designing appropriate late fusion techniques that would make it possible to attain very high categorization accuracy, close to that achieved with user-based text information. Evaluation is carried out in the context of the 2012 Video Genre Tagging Task of the MediaEval Benchmarking Initiative for Multimedia Evaluation, using a data set of up to 15,000…

Cited by 9 publications (9 citation statements) · References 19 publications
“…The dataset is composed of 14,838 videos (3,288 hours) collected from blip.tv and is divided into a training set of 5,288 videos (36%) and a test set of 9,550 videos (64%). Those videos are distributed among 26 video genre categories assigned by the blip.tv media platform, namely (the numbers in brackets are the total number of videos): art (530), autos and vehicles (21), … The main challenge of this collection is the high diversity of genres, as well as the high variety of visual contents within each genre category [14,15].…”
Section: Experiments and Results
confidence: 99%
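The split figures in the citation above are internally consistent; a minimal check (the counts are taken directly from the quoted text) confirms the reported 36%/64% partition:

```python
# Blip10000 split reported above: 5,288 training + 9,550 test videos.
total = 14838
train = 5288
test = 9550

assert train + test == total  # the two partitions cover the full dataset

train_pct = round(100 * train / total)  # → 36
test_pct = round(100 * test / total)    # → 64
print(train_pct, test_pct)
```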
“…On the Blip10000 dataset the highest performance is obtained using the proposed FK RBF run on standard audio descriptors, with an increase of MAP from 29.3% (without RF) to 46.3%, as well as when run on all the combined descriptors, which yields an increase from 30.2% (without RF) to 46.8%. On the other hand, the smallest increase in performance is obtained with BoVW descriptors, which also achieved low results during the MediaEval 2012 Tagging Task benchmarking [59];…”
Section: Comparison With State-of-the-art
confidence: 96%
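The MAP figures quoted above use the standard retrieval definition of mean average precision. As a reference point only (the ranked lists below are toy data, not from the paper), a minimal sketch of that metric:

```python
def average_precision(ranked_relevance):
    """AP for one query: precision averaged over the ranks of relevant items.

    ranked_relevance is a list of 0/1 flags in rank order (1 = relevant).
    """
    hits = 0
    precisions = []
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / max(hits, 1)

def mean_average_precision(runs):
    """MAP: the mean of per-query average precisions."""
    return sum(average_precision(r) for r in runs) / len(runs)

# Toy example with two ranked lists of four items each.
runs = [[1, 0, 1, 0], [0, 1, 1, 1]]
print(round(mean_average_precision(runs), 3))  # → 0.736
```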
“…The second dataset is composed of 100 points corresponding to real video sequences from the Blip10000 [9] dataset. Each video is described via the standard audio features proposed in [17]. In this dataset, videos are classified into three equal-sized categories: "Health", "Documentary" and "Literature".…”
Section: B Multi-class Experiments
confidence: 99%