1998 IEEE Second Workshop on Multimedia Signal Processing (Cat. No.98EX175)
DOI: 10.1109/mmsp.1998.738908

Classification of TV programs based on audio information using hidden Markov model

Cited by 87 publications (51 citation statements); references 3 publications.

“…a common approach is to use Mel-Frequency Cepstral Coefficients (MFCC) [24] or to use time domain features, e.g. Root Mean Square of signal energy (RMS) [23] or Zero-Crossing Rate (ZCR) [25].…”
Section: Audio Descriptors
mentioning
confidence: 99%
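
Of the descriptors named in this excerpt, the two time-domain ones (RMS energy and zero-crossing rate) are simple per-frame statistics, while MFCCs require a mel-scaled spectral analysis and are typically taken from a signal-processing library. A minimal NumPy sketch of the time-domain pair is given below; the frame length, hop size, and sampling rate are arbitrary illustration values, not parameters from the cited papers.

```python
import numpy as np

def frame_signal(x, frame_len=1024, hop=512):
    """Slice a mono signal into overlapping frames."""
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def rms_energy(frames):
    """Root Mean Square of signal energy, one value per frame."""
    return np.sqrt(np.mean(frames ** 2, axis=1))

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose sign changes within each frame."""
    signs = np.sign(frames)
    return np.mean(np.abs(np.diff(signs, axis=1)) > 0, axis=1)

# Illustration: one second of a synthetic 440 Hz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
x = 0.5 * np.sin(2 * np.pi * 440 * t)
frames = frame_signal(x)
print(rms_energy(frames)[:3], zero_crossing_rate(frames)[:3])
```
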
“…Audio-based information may be derived from, both, time and frequency domains. Usual time-domain approaches include the use of Root Mean Square of signal energy (RMS) [23], sub-band information [5], Zero-Crossing Rate (ZCR) [25] or silence ratio; while frequency-domain features include energy distribution, frequency centroid [25], bandwidth, pitch [6] or Mel-Frequency Cepstral Coefficients (MFCC) [24].…”
mentioning
confidence: 99%
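
The frequency-domain descriptors mentioned here (frequency centroid and bandwidth) can be read off the magnitude spectrum of each frame. The sketch below uses one common textbook definition, the power-weighted mean and spread of the spectrum; the exact formulas in the cited works may differ, and the test tone is only an illustration.

```python
import numpy as np

def spectral_centroid_bandwidth(frame, sr=16000):
    """Frequency centroid and bandwidth of one frame from its power spectrum."""
    mag = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    power = mag ** 2
    total = power.sum() + 1e-12            # guard against silent frames
    centroid = (freqs * power).sum() / total
    bandwidth = np.sqrt(((freqs - centroid) ** 2 * power).sum() / total)
    return centroid, bandwidth

# A 1 kHz tone should give a centroid near 1000 Hz and a small bandwidth.
sr = 16000
t = np.arange(1024) / sr
print(spectral_centroid_bandwidth(np.sin(2 * np.pi * 1000 * t), sr))
```
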
“…There have been many systems proposed for specific video genres. Partition and classification of broadcast videos into meaningful sections have attracted significant attention [1][2][3][4][5][6][7][8]. In [7], Liu et al segment news reports from other categories based on both audio and visual information.…”
Section: Related Background
mentioning
confidence: 99%
“…To extract semantic meanings like genre from video, several works have been studied such as video genre or scene detection methods with audio or visual information [2,3]. Ba Tu Truong et al [2] showed about 83.1 % detection rate using C4.5 decision tree classifier and visual information from 60 second long video clips.…”
Section: Introduction
mentioning
confidence: 99%
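
For context, a clip-level genre classifier of the kind attributed to Truong et al. can be sketched with a decision tree. Note that this uses scikit-learn's CART learner as a stand-in for C4.5, and the features and labels below are random placeholders rather than the visual descriptors or data of [2].

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Hypothetical clip-level visual feature vectors and genre labels for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))       # e.g. per-clip motion/colour statistics
y = rng.integers(0, 4, size=200)    # four hypothetical genre labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```
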
“…Ba Tu Truong et al [2] showed about 83.1 % detection rate using C4.5 decision tree classifier and visual information from 60 second long video clips. Zhu Liu et al [3] used audio information and 5-state HMM with 28 symbols in their method. Their result described the accuracy of about 84.7 % with 10 minutes long audio clips.…”
Section: Introduction
mentioning
confidence: 99%
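
The configuration attributed to Liu et al., a 5-state HMM over a 28-symbol discrete observation alphabet, scores an audio clip by the likelihood of its symbol sequence under each genre's model. Below is a plain NumPy sketch of the scaled forward algorithm for such a discrete HMM; the initial, transition, emission, and observation values are random placeholders rather than the parameters of the original system.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM.

    obs : sequence of symbol indices
    pi  : (n_states,) initial state distribution
    A   : (n_states, n_states) transition matrix
    B   : (n_states, n_symbols) emission matrix
    Per-step scaling avoids underflow on long sequences.
    """
    alpha = pi * B[:, obs[0]]
    log_lik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        s = alpha.sum()
        log_lik += np.log(s)
        alpha /= s
    return log_lik

# 5 states and 28 symbols, as in the cited configuration; parameters are random.
rng = np.random.default_rng(0)
n_states, n_symbols = 5, 28
pi = np.full(n_states, 1.0 / n_states)
A = rng.dirichlet(np.ones(n_states), size=n_states)
B = rng.dirichlet(np.ones(n_symbols), size=n_states)

obs = rng.integers(0, n_symbols, size=600)   # a hypothetical clip's symbol sequence
print(forward_log_likelihood(obs, pi, A, B))
```

Genre classification would then assign the clip to the genre whose trained HMM gives the highest log-likelihood.
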