Modeling of Pre-Trained Neural Network Embeddings Learned From Raw Waveform for COVID-19 Infection Detection

Mostaani, Zohreh; Prasad, RaviShankar; Vlasenko, Bogdan; Magimai-Doss, Mathew

doi:10.1109/icassp43922.2022.9746271

Cited by 4 publications

(1 citation statement)

References 23 publications

“…In this paper, we focus on the first track of the challenge, ExVo Multi-Task learning. Taking inspiration from recent works on the use of embeddings from pre-trained networks for various speech processing tasks, including paralinguistic tasks (Keesing et al., 2021; Yang et al., 2021; Mostaani et al., 2022; Srinivasan et al., 2022), we investigate the utility of neural embeddings for speakers' emotion intensity, native country and age estimation. In that regard, as illustrated in Figure 1, we compare two types of neural embedding extraction approaches: (a) neural embeddings extracted from neural networks trained in a self-supervised learning (SSL) setting and (b) neural embeddings extracted from neural networks trained on auxiliary out-of-domain tasks such as SER and phone classification, and on the in-domain ExVo challenge task.…”

Section: Introduction (mentioning)

confidence: 99%

Comparing supervised and self-supervised embedding for ExVo Multi-Task learning track

Purohit, Mahmoud, Vlasenko et al. 2022

Preprint

Self Cite


The ICML Expressive Vocalizations (ExVo) Multi-task challenge 2022 focuses on understanding the emotional facets of non-linguistic vocalizations (vocal bursts, VB). The objective of this challenge is to predict emotional intensities for VB; being a multi-task challenge, it also requires predicting speakers' age and native country. For this challenge, we study and compare two distinct embedding spaces, namely self-supervised learning (SSL) based embeddings and task-specific supervised learning based embeddings. To that end, we investigate feature representations obtained from several pre-trained SSL neural networks and task-specific supervised classification neural networks. Our studies show that the best performance is obtained with a hybrid approach, where predictions derived via both SSL and task-specific supervised learning are used. Our best system on the test set surpasses the ComPARE baseline (in terms of the harmonic mean of all sub-task scores) by a relative 13% margin.
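The abstract summarizes performance as a harmonic mean over sub-task scores and reports a relative margin over the baseline. The following is a minimal sketch of that arithmetic only, assuming every sub-task score is oriented so that higher is better. The score values and the helper name `relative_gain` are hypothetical illustrations, not figures or code from the paper.

```python
from statistics import harmonic_mean


def relative_gain(system: float, baseline: float) -> float:
    """Relative improvement of `system` over `baseline`, as a fraction."""
    return (system - baseline) / baseline


# Hypothetical sub-task scores (higher is better); NOT values from the paper.
baseline_scores = [0.40, 0.50, 0.40]
system_scores = [0.45, 0.55, 0.46]

# The harmonic mean is dominated by the weakest sub-task, so a system
# must do reasonably well on every sub-task to score well overall.
baseline_overall = harmonic_mean(baseline_scores)
system_overall = harmonic_mean(system_scores)
gain = relative_gain(system_overall, baseline_overall)

print(f"baseline={baseline_overall:.3f} system={system_overall:.3f} gain={gain:.1%}")
```

A relative margin of 13% means the system's overall score is 1.13 times the baseline's, not 13 percentage points higher.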

