2012
DOI: 10.1007/s10462-012-9332-4
Multimodal feature extraction and fusion for semantic mining of soccer video: a survey

citations
Cited by 33 publications
(10 citation statements)
references
References 77 publications
“…Consequently, companies such as STATS (Chicago, US), Second Spectrum (Los Angeles, US), ChyronHego (New York, US), and Deltatre (Torino, Italy) have all provided video tracking systems to the market which allow positional data to be collected and used for live and post-match analysis, talent scouting, and media enhancement [3]. A comprehensive survey of the state-of-the-art video-based player tracking systems can be found in Manafifard, Ebadi [4], and a survey on football video analysis was provided by Oskouie, Alipour [5]. Worth mentioning in this context is that a variety of different video tracking systems exist, which can be broadly classified according to the number, arrangement, and specification of utilised cameras (single vs multiple, stationary vs dynamic, stereo vs monocular) [4].…”
Section: Introduction (mentioning)
confidence: 99%
“…To enhance the classifier performance, the misclassification error should be decreased during the learning process. So thus, the misclassification error should be minimized according to cost function in formula (7).…”
Section: Generalized Learning Vector Quantization (GLVQ) (mentioning)
confidence: 99%
“…As the data has multiple peaks it needs custom techniques to analyze the data itself [2,3]. The study of multi-modal data analysis is conducted in the various area e.g., facial expression, sentiment analysis, medical image analysis, video analysis, and surveillance [4][5][6][7][8]. The other researches have been conducted in the special case of multi-modal classification such as large-scale and real-time multi-modal classification [9,10]. There are several approaches proposed to solve classification in multi-modal data, such as using a more complex architecture classifier with kernel combination, incremental learning, and ensemble learning [11][12][13].…”
Section: Introduction (mentioning)
confidence: 99%
“…The survey by Oskouie et al [14] discusses the topic of multimodal feature extraction and fusion for semantic mining of soccer video, using both visual, audio, and text features. The main events of interest for a summary here are goal, penalty, booking, shot on target and offside situation.…”
Section: Related Work (mentioning)
confidence: 99%