2022
DOI: 10.48550/arxiv.2204.02152
Preprint

UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022

Abstract: We present the UTokyo-SaruLab mean opinion score (MOS) prediction system submitted to VoiceMOS Challenge 2022. The challenge is to predict the MOS values of speech samples collected from previous Blizzard Challenges and Voice Conversion Challenges for two tracks: a main track for in-domain prediction and an out-of-domain (OOD) track for which there is less labeled data from different listening tests. Our system is based on ensemble learning of strong and weak learners. Strong learners incorporate several impro…

Cited by 2 publications (9 citation statements)
References 21 publications
“…We also use data augmentation and multi-task training described 2.2. Finally, we adapted contrastive loss [4] to boost ranking performance.…”
Section: Improved SSL Baseline
Citation type: mentioning
confidence: 99%
“…The MOS trainable metric mimics scores collected by human annotation studies which brings challenges in modeling score variance. Some listeners are more strict than others [2,4], and even a single listener adapts its judgments based on the quality of previously judged recordings.…”
Section: Explaining Noise In MOS Annotations
Citation type: mentioning
confidence: 99%