As video content on the web continues to expand, properly annotating videos for effective search and mining becomes increasingly important. While annotating static imagery with keywords is a relatively well-studied problem, annotating videos with natural-language keywords to enhance search is an important emerging problem with great potential to improve the quality of video search. However, leveraging web-scale video datasets for automated annotation also presents new challenges and requires methods designed for scalability and efficiency. In this chapter we review specific state-of-the-art techniques for video analysis, feature extraction, and classification suitable for extremely large-scale automated video annotation, along with the key algorithms and data structures that make truly large-scale video search possible. Drawing on these observations and insights, we present a complete method for automatically augmenting the keyword annotations of videos using the existing annotations of a large video collection. Our approach is designed explicitly to scale to YouTube-sized datasets, and we present experiments and analysis of keyword-augmentation quality on a corpus of over 1.2 million YouTube videos. We demonstrate that automated annotation of web-scale video collections is indeed feasible, and that an approach combining visual features with existing textual annotations outperforms unimodal models.
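The core idea of augmenting a video's keywords from the annotations of a large existing collection can be sketched, in a much-simplified form, as nearest-neighbor keyword transfer: retrieve the videos most similar to the query under some visual feature representation, and vote among their existing keywords. The function and variable names below, the brute-force Euclidean distance, and the toy feature vectors are all illustrative assumptions, not the chapter's actual method (which is built for web-scale corpora).

```python
# Hypothetical sketch: augment a video's keywords by borrowing the most
# frequent keywords of its visually nearest neighbors in an annotated corpus.
# Names, features, and the brute-force distance are illustrative assumptions;
# a web-scale system would use approximate nearest-neighbor indexing instead.
from collections import Counter


def augment_keywords(query_feat, corpus, k=3, n_new=2):
    """corpus: list of (feature_vector, keyword_set) pairs."""
    # Rank corpus videos by squared Euclidean distance of visual features.
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    neighbors = sorted(corpus, key=lambda item: dist(query_feat, item[0]))[:k]
    # Vote: keywords shared by many neighbors are likely relevant to the query.
    votes = Counter(kw for _, kws in neighbors for kw in kws)
    return [kw for kw, _ in votes.most_common(n_new)]


# Toy annotated corpus of (visual feature, keyword set) pairs.
corpus = [
    ((0.9, 0.1), {"soccer", "sports"}),
    ((0.8, 0.2), {"soccer", "goal"}),
    ((0.1, 0.9), {"cooking", "recipe"}),
]
print(augment_keywords((0.85, 0.15), corpus, k=2, n_new=1))  # → ['soccer']
```

At web scale, the sort over the full corpus would be replaced by an approximate nearest-neighbor index, and the vote could be weighted by visual similarity or combined with the query video's own textual annotations, in line with the multimodal approach described above.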