ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9054423

HI-MIA: A Far-Field Text-Dependent Speaker Verification Database and the Baselines

Abstract: This paper presents a far-field text-dependent speaker verification database named HI-MIA. We aim to meet the data requirement for far-field microphone array based speaker verification, since most of the publicly available databases are single-channel, close-talking, and text-independent. The database contains recordings of 340 people in rooms designed for the far-field scenario. Recordings are captured by multiple microphone arrays located in different directions and at different distances from the speaker and a high-fidelity c…

Cited by 54 publications (37 citation statements)
References 29 publications
“…Furthermore, to validate the effect of MSEN on unseen channels, we design the generalized cross-channel (GCC) evaluation. The results on the HI-MIA corpus [18] demonstrate that MSEN drastically reduces the adverse impact of channel mismatch on recognition results, and significantly outperforms the state-of-the-art methods.…”
Section: Introduction
confidence: 94%
“…In this work, we use AISHELL2 [15] as text-independent training data, AISHELL-wakeup [11] as text-dependent training data, and the AISHELL-2019B-eval dataset [11] as the text-dependent test set.…”
Section: Experimental Data
confidence: 99%
“…The AISHELL-2019B-eval contains recordings of 86 speakers with the Chinese wake-up word "ni hao, mi ya". The room setting and recording devices are the same as those of AISHELL-wakeup. Utterances of the last 44 people are selected as the test set since they are more challenging [11]. This corpus has two tasks: a close-talking enrollment task (utterances from the close-talking mic are used for enrollment) and a far-field enrollment task (utterances from one 16-channel circular microphone array which is 1m away from the speaker are used for enrollment).…”
Section: Experimental Data
confidence: 99%