Proceedings of the 2012 ACM International Workshop on Audio and Multimedia Methods for Large-Scale Video Analysis 2012
DOI: 10.1145/2390214.2390220

Short user-generated videos classification using accompanied audio categories

Abstract: This paper investigates the classification of short user-generated videos (UGVs) using the accompanied audio data, since short UGVs account for a great proportion of Internet UGVs and many short UGVs are accompanied by single-category soundtracks. We define seven types of UGVs, corresponding to seven audio categories respectively. We also investigate three modeling approaches for audio feature representation, namely single Gaussian (1G), Gaussian mixture (GMM), and Bag-of-Audio-Word (BoAW) models. Then usin…

Cited by 9 publications (7 citation statements)
References: 13 publications
“…However, most of the video classification work attempts to classify an entire video clip into one of several genres, such as sports, news, cartoon, music. In general, the previous methods can be categorized into four types: text-based approaches [1,23], audio feature based approaches [5,15,17], visual feature based approaches [4,20,22], and those that used some combination of text, audio and visual features [3,4,8]. In fact, most authors incorporated audio and visual features into their approaches (we call it content-based approaches), and these approaches achieved good performance.…”
Section: Related Work (mentioning)
confidence: 99%
“…The ways of using these features investigated in existing work include many of the standard classifiers, chosen because of their ubiquity, such as KNN [8,22], Linear Discriminant Analysis (LDA) [8], SVM [3,4,5,8,16,21], C4.5 decision trees [4,9], and GMM [11,12,17,20]. Moreover, some more complicated methods such as HMM [4,19,20] and neural networks [12] have also been introduced to video genre classification.…”
Section: Related Work (mentioning)
confidence: 99%
“…Our initial acceptance rate of 62.5% reflects the current state of the field: A few teams in the world produce high-quality pioneering research that supports the foundations of multimedia. We are looking forward to the presentations of the ones selected by the program committee [1,2,3,4,5]. Furthermore, we are very honored to present Koichi Shinoda (Tokyo Institute of Technology, Japan) as our keynote speaker.…”
Section: Contributions Of This Workshop (mentioning)
confidence: 99%
“…With the proliferation of Web 2.0 applications, user-generated video (UGV) [1] is poised to inundate the Internet. Recent statistics show that, on the primary video sharing website, YouTube, 48 hours of video are uploaded every minute by users, resulting in nearly 8 years of content uploaded every day.…”
Section: Introduction (mentioning)
confidence: 99%
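
The upload figures quoted in the statement above can be checked with a few lines of arithmetic. The sketch below is only a sanity check of the quoted numbers: the function name and the 365-day year are assumptions made for illustration, not anything taken from the cited paper.

```python
# Sanity check of the quoted figure: 48 hours of video uploaded per minute.
# The function name and the 365-day year are illustrative assumptions.

HOURS_UPLOADED_PER_MINUTE = 48   # figure quoted in the citation statement
MINUTES_PER_DAY = 24 * 60        # 1440 minutes in a day
HOURS_PER_YEAR = 24 * 365        # assumption: a 365-day year


def years_of_content_per_day(hours_per_minute: float) -> float:
    """Convert an upload rate (hours of video per minute) to years of video per day."""
    hours_per_day = hours_per_minute * MINUTES_PER_DAY
    return hours_per_day / HOURS_PER_YEAR


years = years_of_content_per_day(HOURS_UPLOADED_PER_MINUTE)
print(f"{years:.2f} years of video uploaded per day")  # prints 7.89
```

Under these assumptions the result is about 7.89 years of video per day, which agrees with the "nearly 8 years" in the quoted statement.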