Interspeech 2021
DOI: 10.21437/interspeech.2021-1465
Information Retrieval for ZeroSpeech 2021: The Submission by University of Wroclaw

Abstract: We present a number of low-resource approaches to the tasks of the Zero Resource Speech Challenge 2021. We build on the unsupervised representations of speech proposed by the organizers as a baseline, derived from CPC and clustered with the kmeans algorithm. We demonstrate that simple methods of refining those representations can narrow the gap, or even improve upon the solutions which use a high computational budget. The results lead to the conclusion that the CPC-derived representations are still too noisy f…
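The baseline pipeline the abstract refers to discretizes continuous CPC speech features by clustering them with k-means. A minimal sketch of that quantization step is below; it is not the authors' code, and the random features, dimensions, and cluster count are illustrative stand-ins for real CPC embeddings and the challenge's settings.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in corpus: 1000 frames of 256-dim "CPC-like" features.
# In the real pipeline these would come from a trained CPC encoder.
features = rng.normal(size=(1000, 256))

# Cluster the frames into 50 discrete units; the challenge baseline
# similarly runs k-means over CPC features to obtain pseudo-phone units.
kmeans = KMeans(n_clusters=50, n_init=10, random_state=0).fit(features)

# Each frame is mapped to a discrete unit id, usable by downstream
# language-modeling tasks in place of the raw continuous features.
units = kmeans.predict(features)
print(units.shape)  # (1000,)
```

The refinements studied in the paper operate on representations like `features` before or after this clustering step, aiming to reduce the noise in the resulting discrete units.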

Cited by 10 publications (2 citation statements); references 12 publications.
“…The challenge contains two tracks, the non-visually-grounded track and the visually-grounded (VG) track. The non-VG track was introduced earlier, and submissions to this track studied the effect of filtering out speaker information (van Niekerk et al. 2021), denoising CPC (van den Oord, Li, and Vinyals 2018) representations by using methods from information retrieval (Chorowski et al. 2021), and combining CPC with deep clustering (Maekaku et al. 2021). These ideas offered improved results on the phonetic, lexical and syntactic tasks.…”
Section: Related Work
confidence: 99%
“…al. [45] use the speaker information in the training. It is noteworthy that the proposed HUC approach has a low GPU budget (150 h) and does not use any speaker information.…”
Section: E. Comparison With Other Benchmarks On ABX Task
confidence: 99%