Abstract: In order to explore the influence of context on the phonetic design of talk-in-interaction, we investigated the pitch characteristics of short turns (insertions) that are produced by one speaker between turns from another speaker. We investigated the hypothesis that the speaker of the insertion designs her turn as a pitch match to the prior turn in order to align with the previous speaker's agenda, whereas non-matching displays that the speaker of the insertion is non-aligning, for example to initiate a new action. Data were taken from the AMI meeting corpus, focusing on the spontaneous talk of first-language English participants. Using sequential analysis, 177 insertions were classified as either aligning or non-aligning in accordance with definitions of these terms in the Conversation Analysis literature. The degree of similarity between the pitch contour of the insertion and that of the prior speaker's turn was measured, using a new technique that integrates normalized F0 and intensity information. The results showed that aligning insertions were significantly more similar to the immediately preceding turn, in terms of pitch contour, than were non-aligning insertions. This supports the view that choice of pitch contour is managed locally, rather than by reference to an intonational lexicon.
Feedback utterances are among the most frequent utterances in dialogue. Feedback is also a crucial aspect of linguistic theories that take language-based social interaction into account. This paper introduces the corpora and datasets of a project scrutinizing this kind of feedback utterance in French. We present the genesis of the corpora involved in the project, which total about 16 hours of transcribed and phone force-aligned speech. We introduce the resulting datasets and discuss how they are being used in ongoing work focusing on the form-function relationship of conversational feedback. All the corpora created and the datasets produced in the framework of this project will be made available for research purposes.
Precise multimodal studies require precise synchronisation between audio and video signals. However, raw audio and audio from video recordings can be out of sync for several reasons. In order to re-synchronise them, a dynamic programming (DP) approach is presented here. Traditionally, DP is performed on the rectangular distance matrix that compares each value in signal A with each value in signal B. Previous work limited the search space using, for example, the Sakoe-Chiba band (Sakoe and Chiba, 1978). However, the full distance matrix is still allocated, so its overall size remains unchanged. Here, a tunnel matrix and its corresponding DP algorithm are presented. The matrix stores only the distances between the two signals that fall within a pre-specified bandwidth, and the computational cost is reduced accordingly. An example implementation demonstrates the functionality on artificial data and on data from real audio and video recordings.
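The idea of storing only the band can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `tunnel_dtw`, the choice of absolute difference as the local distance, and the assumption that the band is centred on the main diagonal are all illustrative. The sketch keeps an accumulated-cost array of size n × (2w + 1) rather than n × m, where w is the band half-width.

```python
import math


def tunnel_dtw(a, b, half_width):
    """Band-limited DP alignment cost using a compact (tunnel) matrix.

    Hypothetical sketch: only cells with |j - i| <= half_width are stored,
    so memory is len(a) * (2 * half_width + 1) instead of len(a) * len(b).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both signals must be non-empty")
    if abs(n - m) > half_width:
        raise ValueError("band too narrow to reach the end point")

    width = 2 * half_width + 1
    # acc[i][k] holds the accumulated cost of cell (i, j) with k = j - i + half_width.
    # Cells that are never written stay at infinity and act as walls.
    acc = [[math.inf] * width for _ in range(n)]

    for i in range(n):
        j_lo = max(0, i - half_width)
        j_hi = min(m - 1, i + half_width)
        for j in range(j_lo, j_hi + 1):
            k = j - i + half_width
            if i == 0 and j == 0:
                best = 0.0
            else:
                best = math.inf
                if i > 0 and k + 1 < width:
                    best = min(best, acc[i - 1][k + 1])  # predecessor (i-1, j)
                if k > 0:
                    best = min(best, acc[i][k - 1])      # predecessor (i, j-1)
                if i > 0:
                    best = min(best, acc[i - 1][k])      # predecessor (i-1, j-1)
            acc[i][k] = abs(a[i] - b[j]) + best

    # The end point (n-1, m-1) mapped into band coordinates.
    return acc[n - 1][(m - 1) - (n - 1) + half_width]


print(tunnel_dtw([0, 1, 2, 3], [0, 1, 1, 2, 3], 1))  # 0.0
print(tunnel_dtw([1, 2, 3], [2, 3, 4], 2))           # 2.0
```

Because predecessors that fall outside the band are simply cells that were never written, no separate boundary bookkeeping is needed. A band centred on the main diagonal only works when the two signals have similar lengths; for signals with a large constant offset, the band would have to be shifted or widened.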