2024
DOI: 10.1007/s10772-024-10140-6

Comparison of wav2vec 2.0 models on three speech processing tasks

Marie Kunešová,
Zbyněk Zajíc,
Luboš Šmídl
et al.

Abstract: The current state of the art for various speech processing problems is a sequence-to-sequence model based on the self-attention mechanism known as the transformer. The widely used wav2vec 2.0 is a self-supervised transformer model that is pre-trained on large amounts of unlabeled speech and then fine-tuned for a specific task. The data used for pre-training and fine-tuning, along with the size of the transformer model, play a crucial role in both of these training steps. The most commonly used wav2vec 2.0 models are trained on …
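As a minimal illustration of the pre-train-then-fine-tune paradigm the abstract describes, the sketch below loads a publicly released wav2vec 2.0 checkpoint with the Hugging Face transformers library and runs CTC-based speech recognition on a dummy waveform. The checkpoint name facebook/wav2vec2-base-960h is an assumed example for illustration, not necessarily one of the models compared in the paper.

```python
import numpy as np
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

# Assumed checkpoint: any public wav2vec 2.0 ASR model would work here.
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# One second of silence at 16 kHz stands in for a real recording.
waveform = np.zeros(16000, dtype=np.float32)
inputs = processor(waveform, sampling_rate=16000, return_tensors="pt")

# Inference only; fine-tuning would instead backpropagate a CTC loss
# through these logits against labelled transcripts.
with torch.no_grad():
    logits = model(inputs.input_values).logits

predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids))
```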

Cited by 0 publications
References 42 publications