2017
DOI: 10.48550/arxiv.1710.10470
Preprint

Attention-Based Models for Text-Dependent Speaker Verification

Cited by 20 publications (27 citation statements) | References 5 publications
“…Attention mechanisms for speaker verification have been investigated in recent papers. In [26], several methods were proposed for using attention in LSTM-based text-dependent speaker verification. A slightly different strategy for adding attention to the x-vector topology was proposed in [27], while single- and multi-head attention were investigated for TI-SV.…”
Section: Using Two Types of Attention (mentioning)
confidence: 99%
“…Here, we only consider single-head attention in two modes. The first is the same as [27], while for the second we doubled the size of the last hidden layer before pooling and split its dimension equally into two parts, as in [26], using the first part to compute the attention weights (i.e. keys) and the second part to compute the mean and standard deviation statistics (i.e.…”
Section: Using Two Types of Attention (mentioning)
confidence: 99%
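The split-attention pooling described in that excerpt can be sketched as follows. This is a minimal illustration, assuming a simple dot-product scorer v over the "key" half of the doubled layer; the layer sizes, the scoring function, and the function name are illustrative assumptions, not taken from [26] or [27].

    import numpy as np

    def split_attentive_stats_pooling(H, v):
        """H: (T, 2*d) frame-level outputs; v: (d,) scoring vector (assumed)."""
        d = H.shape[1] // 2
        keys, values = H[:, :d], H[:, d:]      # split the doubled layer in half
        scores = keys @ v                       # (T,) unnormalized attention scores
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                # softmax over the T frames
        mean = weights @ values                 # attentive mean, (d,)
        var = weights @ (values - mean) ** 2    # attentive variance, (d,)
        std = np.sqrt(var + 1e-8)
        return np.concatenate([mean, std])      # pooled statistics, (2*d,)

    # Example: 200 frames, hidden size 128 after doubling (d = 64)
    H = np.random.randn(200, 128)
    v = np.random.randn(64)
    print(split_attentive_stats_pooling(H, v).shape)   # (128,)

The point of the split is that the first half of the doubled layer only drives the softmax over frames, while only the second half contributes to the pooled mean and standard deviation.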
“…Inspired by the application of attention mechanisms in speech recognition [29], speaker verification [30], and single-channel keyword spotting [31], and following [17], we incorporate a soft self-attention that projects the K + 1 channels' fbank feature vectors to one channel, so that the KWS model still takes a single-channel input vector, like the baseline single-channel model. For each time step, we compute a (K + 1)-dimensional attention weight vector α for the input fbank feature vectors z = [z_1, z_2, ….”
Section: Joint Training With KWS Model (mentioning)
confidence: 99%
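A minimal sketch of that per-time-step channel attention, assuming a learned scoring vector w applied to each channel's fbank vector; the actual scorer used in [17]/[31] may differ.

    import numpy as np

    def channel_self_attention(Z, w):
        """Z: (K+1, F) fbank vectors at one time step; w: (F,) scoring vector (assumed)."""
        scores = Z @ w                          # (K+1,) one score per channel
        alpha = np.exp(scores - scores.max())
        alpha /= alpha.sum()                    # softmax over the K+1 channels
        return alpha @ Z                        # (F,) single-channel fbank vector

    # Example: K = 3 extra channels, 40-dimensional fbank features
    Z = np.random.randn(4, 40)
    w = np.random.randn(40)
    print(channel_self_attention(Z, w).shape)   # (40,)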
“…Only LSTM [11] and GRU [12] are used in our experiments for a fair comparison. An attention mechanism [13] is applied to obtain an attention weight vector a = {a_1, a_2, a_3, ..., a_T}. Then C, the feature representation for the whole sequential input, is computed as the weighted sum of h = {h_1, h_2, ..., h_T}.…”
Section: The Baseline Model (mentioning)
confidence: 99%
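A minimal sketch of that attention pooling over recurrent outputs, assuming a dot-product scorer u; the cited work [13] may parameterize the scorer differently.

    import numpy as np

    def attention_pooling(h, u):
        """h: (T, d) LSTM/GRU outputs; u: (d,) scoring vector (assumed)."""
        e = h @ u                        # (T,) unnormalized scores
        a = np.exp(e - e.max())
        a /= a.sum()                     # softmax -> attention weights a_t
        C = a @ h                        # (d,) weighted sum of hidden states
        return a, C

    # Example: T = 50 frames of 128-dimensional recurrent outputs
    h = np.random.randn(50, 128)
    u = np.random.randn(128)
    a, C = attention_pooling(h, u)
    print(a.shape, C.shape)              # (50,) (128,)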