2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2011.5947440

Detection of synthetic speech for the problem of imposture

Abstract: In this paper, we present new results from our research into the vulnerability of a speaker verification (SV) system to synthetic speech. We use an HMM-based speech synthesizer, which creates synthetic speech for a targeted speaker through adaptation of a background model, and both GMM-UBM and support vector machine (SVM) SV systems. Using 283 speakers from the Wall Street Journal (WSJ) corpus, our SV systems have a 0.35% EER. When the systems are tested with synthetic speech generated from speaker models derive…

Cited by 64 publications (42 citation statements)
References: 8 publications
“…In consequence, the approaches to detect converted voice and synthesized speech reported in [15][16][17][18][19][20] will not detect speech converted according to the approach described above.…”
Section: Voice Conversion (mentioning)
confidence: 99%
“…As such, converted speech retains real-speech phase information. Attacks of this nature will thus have the potential to overcome the countermeasures proposed in [15][16][17][18][19][20]. This paper reports a new countermeasure which exploits the reduction in pair-wise distances between consecutive feature vectors when they are both shifted towards the same local maxima of the likelihood function of a target speaker model as a consequence of voice conversion.…”
Section: Introduction (mentioning)
confidence: 99%
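The statement above describes a countermeasure based on reduced pair-wise distances between consecutive feature vectors. As a rough illustration of that statistic only, the sketch below averages the Euclidean distance between neighbouring frames. The function name, the choice of Euclidean distance, and the toy frames are assumptions made for this example; the excerpt does not specify the features, the distance measure, or how a decision threshold is set.

```python
import math


def mean_consecutive_distance(frames):
    """Average Euclidean distance between each frame and the next one.

    `frames` is a sequence of equal-length feature vectors (hypothetical
    input; the cited work's actual features are not given in the excerpt).
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    # Pair every frame with its successor and measure the distance.
    distances = [math.dist(a, b) for a, b in zip(frames, frames[1:])]
    return sum(distances) / len(distances)


# Toy frames, invented for illustration: the second and third are identical,
# so their distance is zero and the average drops.
toy_frames = [[0.0, 0.0], [3.0, 4.0], [3.0, 4.0]]
print(mean_consecutive_distance(toy_frames))
```

Under the quoted description, a lower value would be expected for converted speech than for genuine speech, but any comparison depends on details the excerpt leaves out.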
“…For example, one group of such methods exploits the fact that many speech synthesis and voice conversion algorithms disturb the natural phase of the speech signal. In [9] the authors challenged GMM-UBM and SVM-GMM speaker verification systems with genuine and synthesized speech originating from the WSJ corpus. They showed that by using relative phase shift (RPS) features it was possible to decrease the EER from over 81 % to less than 3 %.…”
Section: Spoofing Countermeasures For Asv Systems (mentioning)
confidence: 99%
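The abstract and the statement above both report results as an equal error rate (EER), the operating point where the false acceptance rate and the false rejection rate coincide. A minimal sketch of how an EER can be estimated from verification scores follows. It assumes higher scores mean "more likely genuine" and uses a simple threshold sweep; the function name and the scores are invented for illustration and are not taken from the paper.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. Returns
    (eer, threshold) at the point where the false acceptance rate (FAR)
    and the false rejection rate (FRR) are closest.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    best = None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR at the closest crossing.
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Hypothetical scores, invented for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.5]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With few trials the FAR and FRR curves rarely cross exactly, so the estimate is coarse; evaluations on large corpora typically interpolate between neighbouring thresholds.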