2008
DOI: 10.1109/lsp.2007.911184

Analysis and Improvement of Speech/Music Classification for 3GPP2 SMV Based on GMM

Abstract: In this letter, a novel approach is proposed to improve the performance of speech/music classification for the selectable mode vocoder (SMV) of 3GPP2 using the Gaussian mixture model (GMM). An in-depth analysis of the features and classification method adopted in the conventional SMV is performed. Feature vectors applied to the GMM are then selected from the relevant parameters of the SMV for efficient speech/music classification. The performance of the proposed algorithm is evaluated under various conditions …
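As a rough illustration of the kind of GMM-based speech/music decision the abstract describes, the sketch below scores one feature vector under two diagonal-covariance GMMs and picks the class by log-likelihood ratio. This is a minimal sketch under stated assumptions, not the letter's method: the two-dimensional features, the toy model parameters, the zero threshold, and the names `diag_gmm_loglik`, `classify_frame`, `SPEECH_GMM`, and `MUSIC_GMM` are all hypothetical, and the actual SMV-derived feature set is not given in the visible abstract.

```python
import math


def diag_gmm_loglik(x, weights, means, variances):
    """Log-likelihood of one feature vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        # log of (mixture weight * product of per-dimension Gaussians)
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # log-sum-exp over components for numerical stability
    m = max(component_logs)
    return m + math.log(sum(math.exp(c - m) for c in component_logs))


def classify_frame(x, speech_gmm, music_gmm, threshold=0.0):
    """Label a frame by the log-likelihood ratio between the two class models."""
    llr = diag_gmm_loglik(x, *speech_gmm) - diag_gmm_loglik(x, *music_gmm)
    return "speech" if llr > threshold else "music"


# Hypothetical toy models as (weights, means, variances); NOT values from the letter.
SPEECH_GMM = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
MUSIC_GMM = ([1.0], [[5.0, 5.0]], [[1.0, 1.0]])

if __name__ == "__main__":
    print(classify_frame([0.5, 0.5], SPEECH_GMM, MUSIC_GMM))
    print(classify_frame([5.0, 5.0], SPEECH_GMM, MUSIC_GMM))
```

In practice the class models would be trained on labeled speech and music frames (for example with expectation-maximization), and the threshold would be tuned on held-out data; both steps are omitted here.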

Cited by 14 publications (4 citation statements)
References: 7 publications
“…Research results [7][8][9] show that proper features and in turn good performance can only be obtained based on numerous experiments. Open-loop coding mode selection is essentially one kind of pattern classification.…”
Section: A Semi-open-loop Coding Mode Selection Algorithm Based (mentioning)
confidence: 99%
“…Recently, further improvements in speech/music classification problems have been achieved by adopting several machine learning techniques, such as the support vector machine (SVM) [6,7], Gaussian mixture model (GMM) [8], and deep belief network (DBN) [9] for the selectable mode vocoder (SMV) codec. The enhanced voice services (EVS) speech/music classifier, which is known as the 3rd-generation partnership project (3GPP) standard speech codec for the voice-over-LTE (VoLTE) network, is also based on GMM, but its features were calculated either at a current frame or as a moving average between those in the current and the previous frames [10].…”
Section: Introduction (mentioning)
confidence: 99%
“…Energy, Entropy, RMS, Peak-to-Sidelobe ratio (PSR) from the Hilbert Envelope of the LP Residual, Normalized Autocorrelation Peak Strength (NAPS) of Zero frequency filtered signal [10], [6], [8], [7], [2], [25], [5], [29], [28] speech spectrogram. Whereas, individual notes of music have a specific onset instant, marked by a relatively large burst of energy that make its striation patterns discontinuous [21].…”
Section: Introduction (mentioning)
confidence: 99%