2010 IEEE Fourth International Conference on Semantic Computing
DOI: 10.1109/icsc.2010.23

On the Use of Visual Soft Semantics for Video Temporal Decomposition to Scenes

Abstract: This work examines the possibility of exploiting, for the purpose of video segmentation to scenes, semantic information coming from the analysis of the visual modality. This information, in contrast to the low-level visual features typically used in previous approaches, is obtained by application of trained visual concept detectors such as those developed and evaluated as part of the TRECVID High-Level Feature Extraction Task. A large number of non-binary detectors is used for defining a high-dimensio…

Cited by 10 publications (8 citation statements)
References 21 publications
“…When working with low-level features, the squared Euclidean or the Euclidean distance, depending on the clustering algorithm, are typically used in the literature. However, for the model vectors we use an alternative distance measure which was introduced in [17], and was shown to be appropriate for confidence scores comparison. According to it, if C(I_i) and C(I_k) are two model vectors for the images I_i and I_k respectively, the distance D of C(I_i) and C(I_k) is defined as:…”
Section: Visual Concepts For Clustering (mentioning)
confidence: 99%
“…However, manual processing of large collections of video for extracting structural semantics is practically infeasible, and the state-of-the-art techniques for performing this task automatically generate results that still deviate considerably from perfection (e.g. [9], [10]). Therefore, it is by no means straightforward to say that video structural semantics extracted automatically by current state-of-the-art techniques are useful in interactive retrieval, nor is it of course possible to quantify their potential contribution without detailed experimentation.…”
Section: Retrieval (mentioning)
confidence: 99%
“…[10], [12], further exploit higher-level information such as visual concept and audio event detection results in order to come to a more accurate extraction of the videos' structural semantics. Specifically, in [10] the possibility of exploiting, for the purpose of video segmentation to scenes, semantic information coming from the analysis of the visual modality, was examined.…”
Section: Retrieval (mentioning)
confidence: 99%