2019
DOI: 10.1016/j.cogsys.2018.09.028

Speaker identification using multi-modal i-vector approach for varying length speech in voice interactive systems

Cited by 12 publications (6 citation statements)
References 23 publications
“…These parameters are usually adjusted repeatedly to decrease the cost function as the stochastic descent gradient decreases. Thus, for a given set of exercises, where x^(i) is a given feature vector and y^(i) corresponds to class (true), each hidden layer uses a nonlinear transformation function g to the result of the previous level [10]. This transformation takes into account the parameters W and b, which link one layer to the previous one, and provides the neuron activation values with the following (10), (11):…”
Section: Classifier System Deep Neural Network (mentioning)
confidence: 99%
“…In the problem of text-dependent speaker recognition, a priori knowledge of the text content is used, while the hidden Markov models are more accurate than the models of Gaussian mixtures. In [10], some advantages of using a combination of speech recognition on the syllabic hidden Markov model and speaker-oriented recognition based on the Gaussian mixture model are shown. The systems under consideration, based on traditional Gaussian mixture models (GMM), have achieved satisfactory results for speaker recognition only when the speech length is large enough.…”
Section: Introduction (mentioning)
confidence: 99%
“…(2) The elderly or users with speech defects may encounter a series of problems in the home environment. The special speech recognition system developed can help them solve speech disorders and ensure their safety and health in life [33][34][35][36][37]. (3) In order to satisfy the users in the long distance voice interaction under special environment requirements, the designers of the analysis of the intelligent building system in the process of long distance voice interaction in which may exist problems, put forward the speech recognition system for long distances, and broke the voice interaction possible distance limit, to improve the user experience in the intelligence environment [38][39][40][41][42][43][44].…”
Section: The Voice Interaction (mentioning)
confidence: 99%
“…Despite difficult acoustic settings, humans can extract speaker-specific speech qualities such as pitch, timbre, prosody and create latent speaker identity representations for recognition. With the aid of the state-of-the-art computational techniques, the notion of speaker identification has gained significant attention in the field of biometrics, access control systems that include voice dialing, interactive voice responses and others [2]. The efficiency of a typical speaker identification system solely depends on feature extraction and speaker modeling tasks [3].…”
Section: Introduction (mentioning)
confidence: 99%
“…The Gaussian mixture model-based approaches produce good results in speaker recognition task only when the speech lengths are sufficiently long. To tackle this problem, the authors in [2] proposed GMM based universal background model (GMM-UBM) to prepare multi-model i-vector speaker recognition system for short speech length.…”
Section: Introduction (mentioning)
confidence: 99%