2021
DOI: 10.1016/j.csl.2021.101204
Representation transfer learning from deep end-to-end speech recognition networks for the classification of health states from speech

Cited by 22 publications (7 citation statements)
References 48 publications
“…This study, alongside most previous research, has focused on lexical elements of psychological therapy content (transcribed words), but it does not address the nonlexical, phonological features of talk (such as intonation and prosody) that can be an important predictor of health [66]. Therefore, future research should address the integration of lexical and phonological analyses of psychological therapy content for more accurate representations of in-session events.…”
Section: Discussion (mentioning)
Confidence: 99%
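As a loose illustration of the nonlexical features this statement points to, the sketch below computes simple intonation and loudness descriptors (pitch-contour and frame-energy statistics) for a single utterance. The use of librosa and this particular feature set are illustrative assumptions, not the method of the cited study.

```python
# Minimal sketch: coarse prosodic (nonlexical) descriptors from one recording.
# The feature set is illustrative, not the one used in the cited work.
import librosa
import numpy as np

def prosodic_features(path: str, sr: int = 16000) -> dict:
    """Return simple intonation/energy statistics for one utterance."""
    y, sr = librosa.load(path, sr=sr)

    # Fundamental frequency (pitch) contour via probabilistic YIN.
    f0, voiced_flag, _ = librosa.pyin(
        y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
    )
    f0_voiced = f0[voiced_flag & ~np.isnan(f0)]

    # Frame-level energy as a crude proxy for loudness/stress patterns.
    rms = librosa.feature.rms(y=y)[0]

    return {
        "f0_mean": float(np.mean(f0_voiced)) if f0_voiced.size else 0.0,
        "f0_std": float(np.std(f0_voiced)) if f0_voiced.size else 0.0,
        "f0_range": float(np.ptp(f0_voiced)) if f0_voiced.size else 0.0,
        "voiced_ratio": float(np.mean(voiced_flag)),
        "rms_mean": float(np.mean(rms)),
        "rms_std": float(np.std(rms)),
    }
```

Descriptors like these could then be concatenated with lexical features from transcripts, which is the kind of integration the statement calls for.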
“…The AM translates ATC speech into phoneme-based text sequences that the PM later converts to word-based order, i.e., the final objective of this study. In [15], a feature representation learning architecture has been proposed. This method encompasses the combined use of various extracted feature representations with Compact Bilinear Pooling (CBP), Automated Speech Recognition (ASR), a DNN as feature extractor, and final inference through optimized RNN classifiers.…”
Section: Related Work (mentioning)
Confidence: 99%
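To make the fusion step concrete, here is a minimal sketch of Compact Bilinear Pooling in its count-sketch plus FFT form (Gao et al., 2016). The dimensions, random inputs, and NumPy implementation are illustrative assumptions, not the architecture proposed in [15].

```python
# Minimal sketch of Compact Bilinear Pooling (count-sketch + FFT variant),
# shown only to illustrate how two feature streams can be fused.
import numpy as np

rng = np.random.default_rng(0)

def make_sketch_params(d: int, proj_dim: int):
    """Random buckets and signs for one count-sketch projection.
    In practice these are drawn once and reused for every sample."""
    return rng.integers(0, proj_dim, size=d), rng.choice([-1.0, 1.0], size=d)

def count_sketch(x, h, s, proj_dim):
    out = np.zeros(proj_dim)
    np.add.at(out, h, s * x)          # scatter-add signed feature entries
    return out

def compact_bilinear_pool(x, y, px, py, proj_dim=512):
    """Approximate the outer product of x and y by circular convolution
    of their count sketches, computed via FFT."""
    fx = np.fft.rfft(count_sketch(x, *px, proj_dim))
    fy = np.fft.rfft(count_sketch(y, *py, proj_dim))
    return np.fft.irfft(fx * fy, n=proj_dim)

# Fuse two hypothetical representations, e.g., ASR-derived and acoustic:
px, py = make_sketch_params(256, 512), make_sketch_params(128, 512)
fused = compact_bilinear_pool(
    rng.standard_normal(256), rng.standard_normal(128), px, py
)
print(fused.shape)                    # (512,)
```

The fused vector would then feed a downstream classifier, standing in for the optimized RNN classifiers the statement mentions.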
“…In the case of a limited-size dataset, transfer learning is used to relax the hypothesis that the training data must be large and independent and identically distributed with the test data. This motivates many works [68], [69], [70], [71], [72], [73] to use transfer learning in the presence of insufficient training data for speech and language classification. A network pretrained on a large dataset, such as ImageNet [74], keeps its structure and connection parameters when used for network-based deep transfer learning to compute intermediate image representations for smaller datasets.…”
Section: B. Machine Learning Models (mentioning)
Confidence: 99%
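A minimal sketch of the feature-extraction pattern described here: an ImageNet-pretrained network is frozen, its classifier head is removed, and its intermediate representations are reused for a smaller target dataset. The choice of ResNet-18, the input shapes, and the torchvision weights API (version 0.13 or later) are assumptions for illustration, not the setups of the cited works.

```python
# Minimal sketch of network-based deep transfer learning: freeze an
# ImageNet-pretrained backbone and use it only as a feature extractor.
import torch
import torchvision

weights = torchvision.models.ResNet18_Weights.DEFAULT
backbone = torchvision.models.resnet18(weights=weights)
backbone.fc = torch.nn.Identity()        # drop the ImageNet classifier head
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False              # keep structure and parameters fixed

# Hypothetical small-dataset batch: 8 images of size 3x224x224.
batch = torch.randn(8, 3, 224, 224)
with torch.no_grad():
    feats = backbone(batch)              # intermediate representations
print(feats.shape)                       # torch.Size([8, 512])

# A lightweight head for the small target task, e.g.
# torch.nn.Linear(512, num_classes), is then trained on feats
# while the backbone stays frozen.
```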