Interspeech 2018
DOI: 10.21437/interspeech.2018-1395

Towards an Unsupervised Entrainment Distance in Conversational Speech Using Deep Neural Networks

Abstract: Entrainment is a known adaptation mechanism that causes interaction participants to adapt or synchronize their acoustic characteristics. Understanding how interlocutors tend to adapt to each other's speaking style through entrainment involves measuring a range of acoustic features and comparing those via multiple signal comparison methods. In this work, we present a turn-level distance measure obtained in an unsupervised manner using a Deep Neural Network (DNN) model, which we call Neural Entrainment Distance …

Cited by 12 publications (12 citation statements)
References 17 publications

“…Behavior quantification from speech: Behavioral signal processing (BSP) (Georgiou et al, 2011b) can play a central role in informing human assessment and decision making, especially in assisting domain specialists to observe, evaluate and identify domain-specific human behaviors exhibited over longer time scales. For example, in couples therapy (Black et al, 2013; Nasir et al, 2017b), depression (Gupta et al, 2014; Nasir et al, 2016; Stasak et al, 2016; Tanaka et al, 2017) and suicide risk assessment (Cummins et al, 2015; Venek et al, 2017; Nasir et al, 2018, 2017a), behavior analysis systems help psychologists observe and evaluate domain-specific behaviors during interactions. Li et al (2016) proposed sparsely connected and disjointly trained deep neural networks to deal with the low-resource data issue in behavior understanding.…”
Section: Related Work (mentioning)
confidence: 99%
“…Moreover, many other aspects of behavior, such as entrainment, turn-taking duration, pauses, non-verbal vocalizations, and influence between interlocutors, can be incorporated. Many such additional features can be similarly developed on different data and employed as primitives; for example, entrainment measures can be trained through unlabeled data (Nasir et al, 2018). Furthermore, we expect that the results of behavior classification accuracy may be further improved through improved architectures, parameter tuning, and data engineering for each behavior of interest.…”
Section: Is the Contextual (Sequential) Information Important In Defi… (mentioning)
confidence: 99%
“…More recently, dynamic models to characterize the changes in behavior of couples during interactions have been proposed, both in acoustic [67] and lexical modalities [12], and extensions of the lexical work to produce more robust methods have been introduced within a neural-net framework [68]. Finally, some early results from our current work on prediction of marital outcome from acoustic features were presented in [69] with a simpler methodology and basic analyses. In the current work, we developed an improved framework that extracts both short-term and long-term temporal changes in acoustic features.…”
Section: Related Literature (mentioning)
confidence: 99%
“…When people engage in conversations in social settings, they tend to coordinate with each other and show similar behavior in various modalities. This tendency, known as entrainment or coordination, is exhibited through facial expressions [1], head-motion [2], vocal patterns (vocal entrainment) [3,4], as well as the use of language (linguistic coordination) [5]. Linguistic coordination is a well-established phenomenon in both spoken and written communication that has many collaborative benefits.…”
Section: Introduction (mentioning)
confidence: 99%