2022
DOI: 10.48550/arxiv.2204.02166
Preprint

Disentangled Speech Representation Learning Based on Factorized Hierarchical Variational Autoencoder with Self-Supervised Objective

Yuying Xie,
Thomas Arildsen,
Zheng-Hua Tan

Abstract: Disentangled representation learning aims to extract explanatory features or factors and retain salient information. Factorized hierarchical variational autoencoder (FHVAE) presents a way to disentangle a speech signal into sequential-level and segmental-level features, which represent speaker identity and speech content information, respectively. As a self-supervised objective, autoregressive predictive coding (APC), on the other hand, has been used in extracting meaningful and transferable speech features for…
