Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429)
DOI: 10.1109/icip.2003.1246886

Intermodal collaboration: a strategy for semantic content analysis for broadcasted sports video

Abstract: This paper presents intermodal collaboration, a strategy for semantic content analysis of broadcast sports video. Broadcast video can be viewed as a set of multimodal streams, such as visual, auditory, text (closed-caption), and graphics streams. Collaborative analysis of these multimodal streams is performed on the basis of the temporal dependencies between them, in order to improve the reliability and efficiency of semantic content analysis tasks such as extracting highlight scenes from sports video and automati…
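
As a rough illustration of the strategy the abstract describes, the Python sketch below keeps a closed-caption event as a highlight candidate only when an excitement peak in the audio stream occurs nearby in time. The cue times, keyword list, and 5-second co-occurrence window are invented for the example and are not values from the paper.

```python
# Illustrative sketch of intermodal collaboration: a closed-caption event is
# kept as a highlight candidate only if an audio excitement peak occurs nearby
# in time. All times, keywords, and the 5 s window are invented for the example.

AUDIO_PEAKS = [12.4, 87.0, 310.5]                              # excitement peak times (s)
CC_EVENTS = [(11.8, "goal"), (95.2, "foul"), (309.9, "goal")]  # (time s, keyword)

def highlight_candidates(audio_peaks, cc_events, window=5.0):
    """Keep a caption event only when an audio peak lies within `window` seconds."""
    return [(t, kw) for t, kw in cc_events
            if any(abs(t - p) <= window for p in audio_peaks)]

print(highlight_candidates(AUDIO_PEAKS, CC_EVENTS))
# [(11.8, 'goal'), (309.9, 'goal')] -- the 'foul' caption has no nearby audio peak
```

One simple reading of the collaboration idea is exactly this kind of temporal gating: a detection in one modal stream is trusted only when a temporally dependent cue appears in another.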

Cited by 22 publications (32 citation statements). References: 14 publications.
Citation types: 0 supporting, 32 mentioning, 0 contrasting.

Citation statements (ordered by relevance):
“…Event detection is approached either by developing feature-based event models [8], [12], [13], [22]-[24], [26], by searching for keywords in speech (e.g. commentator) [4] and closed captions [16], by using MPEG-7 metadata [11], or by involving several of the above-mentioned clues in inter-modal collaboration [3], [9], [21]. We see the main disadvantage of this approach in the need for numerous and reliable event models which should take into account not only all highlight-related events but also various realizations of these events and their coverage that may change from one broadcaster to another.…”
Citation type: mentioning (confidence: 99%)
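
The keyword-spotting route this excerpt mentions can be sketched as follows; the caption format, keyword table, and `spot_events` helper are hypothetical and not taken from the cited systems.

```python
import re

# Hypothetical keyword spotting over timestamped closed-caption lines; the
# keyword table and caption tuples are invented, not taken from the cited work.
EVENT_KEYWORDS = {"goal": "GOAL", "penalty": "PENALTY", "red card": "RED_CARD"}

def spot_events(captions):
    """captions: iterable of (time_sec, text) -> list of (time_sec, event_label)."""
    hits = []
    for t, text in captions:
        for phrase, label in EVENT_KEYWORDS.items():
            if re.search(rf"\b{re.escape(phrase)}\b", text, re.IGNORECASE):
                hits.append((t, label))
    return hits

captions = [(31.2, "What a strike, goal for the home side!"),
            (54.0, "Play resumes at midfield.")]
print(spot_events(captions))   # [(31.2, 'GOAL')]
```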
“…Some of the works mentioned above [12,18,20,24,25], and also [7,15,26-29], combined features from different modalities, and reported better concept inference as compared to using just single-modality analysis.…”
Section: Related Work | Citation type: mentioning (confidence: 99%)
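
A minimal sketch of the cross-modality combination credited here with better concept inference, assuming a simple weighted late fusion of per-modality scores (the weights and the 0.5 decision threshold are invented for the example):

```python
# Sketch of weighted late fusion across modalities; weights and the 0.5
# decision threshold are illustrative assumptions, not values from the paper.

def fuse(scores, weights, threshold=0.5):
    """scores, weights: dicts keyed by modality name -> (decision, fused score)."""
    fused = sum(weights[m] * scores[m] for m in scores)
    return fused >= threshold, fused

scores  = {"visual": 0.4, "audio": 0.8, "text": 0.7}
weights = {"visual": 0.5, "audio": 0.3, "text": 0.2}
decision, fused = fuse(scores, weights)
print(decision, round(fused, 2))   # True 0.58
```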
“…For techniques relying on embedded or external text [6,8,10,26,27], the main issue is availability. In [6,26] for instance, CCs are not available in many countries, so their utilization, even though valuable, would be limited.…”
Section: Text-based Analysis | Citation type: mentioning (confidence: 99%)
“…Previous works have illustrated that inter-modal collaboration based on multi-modal streams (e.g. visual and text [32], audio and motion [33]) can improve the robustness of the system. Hence, we have also applied the available information from different modalities for our setup to create a multi-level, multi-modal system.…”
Section: System Overview | Citation type: mentioning (confidence: 99%)
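
One way to picture such a multi-level, multi-modal setup is a cascade in which a cheap modality proposes candidates and a costlier one verifies them; both detectors below are invented stubs rather than components of the cited system.

```python
# Invented two-level cascade: a cheap audio pass proposes candidate frames and
# a costlier visual check confirms them. Both detectors are stand-in stubs.

def audio_candidates(energy, threshold=0.7):
    """Level 1: frame indices whose audio energy exceeds the threshold."""
    return [i for i, e in enumerate(energy) if e > threshold]

def visual_confirms(i):
    """Level 2 stub: pretend a replay/scoreboard cue is detected on even frames."""
    return i % 2 == 0

def detect_highlights(energy):
    return [i for i in audio_candidates(energy) if visual_confirms(i)]

print(detect_highlights([0.2, 0.9, 0.8, 0.95, 0.1]))   # [2]
```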