End-to-end Spoken Language Understanding (SLU) systems, which operate without speech-to-text conversion, are especially promising in low-resource scenarios. They can be more effective when there is not enough labeled data to train reliable speech recognition and language understanding systems, or where running SLU on the edge is preferred over cloud-based services. In this paper, we present an approach for bootstrapping end-to-end SLU in low-resource scenarios. We show that incorporating layers extracted from pre-trained acoustic models, instead of using the typical Mel filter bank features, leads to better-performing SLU models. Moreover, the layers extracted from a model pre-trained on one language perform well even for (a) SLU tasks in a different language and (b) utterances from speakers with speech disorders.
Major depressive disorder, referred to as depression, is a leading cause of disability, absence from work, and premature death. Automatic assessment of depression from speech is a critical step towards improving the diagnosis and treatment of depression. Previous work on depression assessment from speech has considered various acoustic features to estimate depression severity. However, the performance of these approaches does not yet meet clinical standards and requires further improvement. In this work, we examine two novel approaches for improving depression severity estimation from short audio recordings of speech. Specifically, in audio recordings of a narrative by individuals diagnosed with major depressive disorder, we analyze spectral-based and excitation source-based features extracted from speech, as well as the significance of sentiment and emotion classification in the estimation of depression severity. Initial results indicate synchrony between depression scores and the sentiment and emotion labels. We propose the use of sentiment- and emotion-based embeddings, obtained using machine learning techniques, in the estimation of depression severity. We also propose the use of multi-task training to better estimate depression severity. We show that the proposed approaches provide additive improvements in the estimation of depression severity.