Feature selection (FS) is the process of choosing the most informative and descriptive characteristics of a signal so as to improve classification. The process is utilized in many areas, such as machine learning, pattern recognition, and signal processing. FS reduces the dimensionality of a signal while preserving the most informative features for further processing. A speech signal can consist of thousands of features. Feature extraction methods such as Average Framing Linear Prediction Coding (AFLPC) using the wavelet transform reduce the number of features from thousands to hundreds. However, the resulting feature vector still contains some redundancy. In addition, some features are similar across classes and therefore provide no discrimination. Taking such features into consideration in the classification process does not help to identify classes; on the contrary, they only confuse the classifier and inhibit accurate identification. This paper proposes an FS method that uses evolutionary optimization techniques to select the most informative features, i.e., those that maximize the classification rate of a Bayesian classifier. The classification rate is further improved by modeling the features with the proper number of Gaussian distributions. The results of the comparative analysis conducted show that the selection-based individual speaker model gives the best classification rate performance.

Keywords: Feature Selection, Speaker Identification, Bayes Theorem.
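The selection scheme summarized above can be sketched as a wrapper method: a genetic algorithm evolves binary feature masks, and each mask's fitness is the classification rate of a Bayesian classifier trained on the selected features. The sketch below is illustrative only, assuming a diagonal-Gaussian Bayes classifier and a basic genetic algorithm (truncation selection, one-point crossover, bit-flip mutation); all function names, GA parameters, and the synthetic data are assumptions, not the paper's actual implementation.

```python
# Hedged sketch: GA-based feature selection around a Gaussian Bayes
# classifier. Illustrative only; not the paper's actual method.
import numpy as np

rng = np.random.default_rng(0)

def gaussian_bayes_accuracy(X_tr, y_tr, X_te, y_te):
    """Fit one diagonal Gaussian per class; return test accuracy."""
    classes = np.unique(y_tr)
    log_post = []
    for c in classes:
        Xc = X_tr[y_tr == c]
        mean = Xc.mean(axis=0)
        var = Xc.var(axis=0) + 1e-6          # floor to avoid divide-by-zero
        prior = len(Xc) / len(X_tr)
        # log-likelihood of each test vector under this class's Gaussian
        ll = -0.5 * (np.log(2 * np.pi * var)
                     + (X_te - mean) ** 2 / var).sum(axis=1)
        log_post.append(ll + np.log(prior))
    pred = classes[np.argmax(np.stack(log_post, axis=1), axis=1)]
    return float((pred == y_te).mean())

def ga_select(X_tr, y_tr, X_te, y_te, pop=12, gens=10, p_mut=0.05):
    """Evolve binary feature masks; fitness = classification rate."""
    n_feat = X_tr.shape[1]
    population = rng.integers(0, 2, size=(pop, n_feat))

    def fitness(mask):
        if mask.sum() == 0:                  # empty mask is useless
            return 0.0
        return gaussian_bayes_accuracy(X_tr[:, mask == 1], y_tr,
                                       X_te[:, mask == 1], y_te)

    for _ in range(gens):
        scores = np.array([fitness(m) for m in population])
        order = np.argsort(scores)[::-1]
        parents = population[order[:pop // 2]]   # truncation selection
        children = []
        for _ in range(pop - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_feat)        # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_feat) < p_mut    # bit-flip mutation
            child[flip] ^= 1
            children.append(child)
        population = np.vstack([parents, children])

    scores = np.array([fitness(m) for m in population])
    best = population[np.argmax(scores)]
    return best, float(scores.max())
```

On synthetic data where only a couple of dimensions carry class information, the evolved mask tends to keep those dimensions and drop the noise features, which is the intuition behind removing redundant, non-discriminative features before classification.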
I. INTRODUCTION

Research on automatic speech recognition (ASR) has been actively conducted over the past four decades [1]. ASR is a tool with many potential applications, such as the automation of operator-assisted services and speech-to-text systems for hearing-impaired individuals [2]. In speaker recognition systems, the speech signal is represented by several features, which play a major part in system design. Karhunen-Loeve transform (KLT) based features [3], Mel Frequency Cepstral Coefficients (MFCC) [4], Linear Predictive Cepstral Coefficients (LPCC) [5], and wavelet transform-based features [6]-[8] are examples of speech signal features.

Various approaches have been proposed to reduce the number of features required for speech recognition. Paliwal [9] reduced the dimensionality of feature vectors in speech recognition systems and tested the technique on four methods. In [10], the Laplacian Eigenmaps Latent Variable Model (LELVM) used fewer MFCC vectors without affecting the recognition rate, and it exhibited better performance than Principal Component Analysis (PCA). Feature frame selection based on phonetic information has also been investigated to increase the classification rate; however, the exact phonemes cannot be easily extracted [11]. Joint factor analysis (JFA) [12]-[14] is commonly used to enhance the performance of text-independent speaker verification systems by modeling speaker and session variability. This work has been extended to the i-vector approach, which outperforms JFA in terms of complexity and model size [15]. The classification rate increases with the number of feature...