Interspeech 2022
DOI: 10.21437/interspeech.2022-519
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation

Cited by 15 publications (8 citation statements) · References 0 publications
“…However, the failure modes of SSL models are still poorly understood, and it remains unclear whether they provide more or less robustness to adversarial attacks than fully supervised models. Due to the importance of this research direction, while writing this paper, there is already some related research about enhancing the robustness of SSL models [281], [352]- [354] and identifying their vulnerability to adversarial attack [351].…”
Section: Discussion
confidence: 99%
“…While larger versions of HuBERT have shown improved ASR accuracy in noisy conditions [17], for edge applications such large footprint models can be problematic, and smaller models can be more sensitive to noise. To this end, the authors in [15] proposed the so-called Robust HuBERT method, where domain adversarial training was used to make the system more robust to environmental factors. More specifically, a domain discriminator is responsible for classifying the source of the distortions applied to the utterance.…”
Section: Making HuBERT Robust To Noise
confidence: 99%
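The domain-adversarial setup described in the excerpt above (a domain discriminator trained to classify the distortion applied to an utterance, with its gradient reversed before flowing into the upstream encoder) can be sketched with a toy linear model. This is a minimal NumPy illustration of the gradient-reversal idea only: the function name `domain_adversarial_step`, the linear feature extractor, and all hyperparameters are illustrative assumptions, not Robust HuBERT's actual architecture.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def domain_adversarial_step(Wf, Wd, x, domain, lam=0.1, lr=0.01):
    """One DANN-style update on a toy linear model.

    The discriminator Wd learns to classify the distortion domain from
    features f = x @ Wf (gradient descent), while the feature extractor
    Wf receives the *reversed* domain gradient (scaled by lam), pushing
    the features toward domain invariance.
    """
    f = x @ Wf                       # (batch, feat): "encoder" output
    logits = f @ Wd                  # (batch, n_domains)
    p = softmax(logits)
    n = x.shape[0]
    onehot = np.zeros_like(p)
    onehot[np.arange(n), domain] = 1.0
    loss = -np.log(p[np.arange(n), domain] + 1e-12).mean()

    dlogits = (p - onehot) / n       # cross-entropy gradient w.r.t. logits
    gWd = f.T @ dlogits              # discriminator gradient
    gWf = x.T @ (dlogits @ Wd.T)     # encoder gradient before reversal

    Wd_new = Wd - lr * gWd           # discriminator minimizes domain loss
    Wf_new = Wf + lr * lam * gWf     # reversal: encoder ascends domain loss
    return Wf_new, Wd_new, loss
```

The sign flip in the `Wf` update is the whole trick: the discriminator gets better at spotting the distortion source while the encoder is simultaneously pushed to erase that information from its features.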
“…Unseen noise and reverberation levels, for example, are known to drastically reduce the accuracy of even state-of-the-art ASR systems [12] and can be highly sensitive to different environmental conditions [13,14]. While domain adaptation techniques, such as those proposed in "Robust HuBERT" [15] and "deHuBERT" [16], can alleviate this problem, the methods are not directly applicable for compression. Recent works have started to propose solutions that tackle compression and environmental robustness jointly (e.g., [17,18]).…”
Section: Introduction
confidence: 99%
“…SPIN employs speaker invariant clustering to improve content representations. The term SSFT was proposed in [4] to distinguish fine-tuning methods using only audio [5,6] from supervised fine-tuning using labelled data [7].…”
Section: Introduction
confidence: 99%