2024
DOI: 10.1109/taslp.2024.3389636

Masked Modeling Duo: Towards a Universal Audio Pre-Training Framework

Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, et al.

Abstract: Self-supervised learning (SSL) using masked prediction has made great strides in general-purpose audio representation. This study proposes Masked Modeling Duo (M2D), an improved masked prediction SSL, which learns by predicting representations of masked input signals that serve as training signals. Unlike conventional methods, M2D obtains a training signal by encoding only the masked part, encouraging the two networks in M2D to model the input. While M2D improves general-purpose audio representations, a specia…

Cited by 3 publications
References 73 publications