Purpose Conversational entrainment, the phenomenon whereby communication partners synchronize their behavior, is considered essential for productive and fulfilling conversation. Lack of entrainment could, therefore, negatively impact conversational success. Although studied in many disciplines, entrainment has received limited attention in the field of speech-language pathology, where its implications may have direct clinical relevance. Method A novel computational methodology, informed by expert clinical assessment of conversation, was developed to investigate conversational entrainment across multiple speech dimensions in a corpus of experimentally elicited conversations involving healthy participants. The predictive relationship between the methodology output and an objective measure of conversational success, communicative efficiency, was then examined. Results Using a real versus sham validation procedure, we find evidence of sustained entrainment in rhythmic, articulatory, and phonatory dimensions of speech. We further validate the methodology, showing that models built on speech signal entrainment measures consistently outperform models built on nonentrained speech signal measures in predicting communicative efficiency of the conversations. Conclusions A multidimensional, clinically meaningful methodology for capturing conversational entrainment, validated in healthy populations, has implications for disciplines such as speech-language pathology where conversational entrainment represents a critical knowledge gap in the field, as well as a potential target for remediation.
In automatic speech processing systems, speaker diarization is a crucial front-end component to separate segments from different speakers. Inspired by the recent success of deep neural networks (DNNs) in semantic inferencing, triplet loss-based architectures have been successfully used for this problem. However, existing work utilizes conventional i-vectors as the input representation and builds simple fully connected networks for metric learning, thus not fully leveraging the modeling power of DNN architectures. This paper investigates the importance of learning effective representations from the sequences directly in metric learning pipelines for speaker diarization. More specifically, we propose to employ attention models to learn embeddings and the metric jointly in an end-to-end fashion. Experiments are conducted on the CALLHOME conversational speech corpus. The diarization results demonstrate that, besides providing a unified model, the proposed approach achieves improved performance when compared against existing approaches.
Acoustic-prosodic entrainment describes the tendency of humans to align or adapt their speech acoustics to each other in conversation. This alignment of spoken behavior has important implications for conversational success. However, modeling the subtle nature of entrainment in spoken dialogue continues to pose a challenge. In this paper, we propose a straightforward definition for local entrainment in the speech domain and operationalize an algorithm based on this: acoustic-prosodic features that capture entrainment should be maximally different between real conversations involving two partners and sham conversations generated by randomly mixing the speaking turns from the original two conversational partners. We propose an approach for measuring local entrainment that quantifies alignment of behavior on a turn-by-turn basis, projecting the differences between interlocutors' acoustic-prosodic features for a given turn onto a discriminative feature subspace that maximizes the difference between real and sham conversations. We evaluate the method using the derived features to drive a classifier aiming to predict an objective measure of conversational success (i.e., low versus high), on a corpus of task-oriented conversations. The proposed entrainment approach achieves 72% classification accuracy using a Naive Bayes classifier, outperforming three previously established approaches evaluated on the same conversational corpus.
Purpose Coordination of communicative behavior supports shared understanding in conversation. The current study brings together analysis of two speech coordination strategies, entrainment and compensation of articulation, in a preliminary investigation into whether strategy organization is shaped by a challenging communicative context—conversing with a person who has a communication disorder. Method As an initial clinical test case, an automated measure of articulatory precision was analyzed in a corpus of spoken dialogue, where a confederate conversed with participants with traumatic brain injury ( n = 28) and participants with no brain injury ( n = 48). Results Overall, the confederate engaged in significant entrainment and high compensation (hyperarticulation) in conversations with participants with traumatic brain injury relative to significant entrainment and low compensation (hypoarticulation) in conversations with participants with no brain injury. Furthermore, the confederate's articulatory precision changed over the course of the conversations. Conclusions Findings suggest that the organization of conversational coordination is sensitive to context, supporting synergistic models of spoken dialogue. While corpus limitations are acknowledged, these initial results point to differences in the way in which speech strategies are realized in challenging communicative contexts, highlighting a viable and important target for investigation with clinical populations. A framework for investigating speech coordination strategies in tandem and ideas for advancing this line of inquiry serve as key contributions of this work.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.