2003
DOI: 10.1117/12.514397

<title>Investigation on effectiveness of mid-level feature representation for semantic boundary detection in news video</title>

Abstract: In our past work, we attempted to use a mid-level feature, namely the state population histogram obtained from the Hidden Markov Model (HMM) of a general sound class, for speaker change detection in order to extract semantic boundaries in broadcast news. In this paper, we compare the performance of our previous approach with another approach based on video shot detection and speaker change detection using the Bayesian Information Criterion (BIC). Our experiments show that the latter approach performs signific…
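The abstract names BIC-based speaker change detection without giving details, so the following is only a minimal sketch of the commonly used full-covariance Gaussian ΔBIC test, not the authors' configuration. The function name `delta_bic`, the `penalty_weight` parameter, and the small ridge added to the covariance are assumptions made for illustration. A positive value favors modeling the two sides of frame `t` as separate Gaussians, which is read as evidence of a speaker change at `t`.

```python
import numpy as np


def delta_bic(features, t, penalty_weight=1.0):
    """Delta-BIC for a hypothesized change point at frame t.

    features: array of shape (n_frames, n_dims), e.g. cepstral vectors.
    Returns a score; values > 0 favor a change at t.
    All names and defaults here are illustrative assumptions.
    """
    n, d = features.shape
    left, right = features[:t], features[t:]

    def logdet_cov(x):
        # Maximum-likelihood covariance; reshape covers the d == 1 case.
        cov = np.cov(x, rowvar=False, bias=True).reshape(d, d)
        # Small ridge (assumption) keeps the determinant well defined.
        cov = cov + 1e-6 * np.eye(d)
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    # Penalty for the extra mean vector and covariance matrix
    # introduced by the two-model hypothesis.
    n_params = d + 0.5 * d * (d + 1)
    penalty = 0.5 * penalty_weight * n_params * np.log(n)

    return (0.5 * n * logdet_cov(features)
            - 0.5 * len(left) * logdet_cov(left)
            - 0.5 * len(right) * logdet_cov(right)
            - penalty)
```

In practice the score is usually evaluated at many candidate frames inside a sliding window, and the frame with the largest positive value is taken as the change point; that search strategy is likewise an assumption rather than something stated in the abstract.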

Cited by 2 publications (1 citation statement)
References 10 publications (3 reference statements)
“…We use simple Gaussian Mixture Models (GMM's) for audio classification that use the MDCT coefficients from the AC-3 stream. We also carry out speaker change detection using the MDCT coefficients [4]. For news content for instance, knowing the speaker transitions and finding the principal speakers helps find the story boundaries (See Figure 2).…”
Section: Video Summarization With Audio and Video Descriptors (citation type: mentioning)
confidence: 99%