Techniques that use nonverbal behaviors to predict turn-changing situations—such as, in multiparty meetings, who the next speaker will be and when the next utterance will occur—have been receiving a lot of attention in recent research. To build a model for predicting these behaviors we conducted a research study to determine whether respiration could be effectively used as a basis for the prediction. Results of analyses of utterance and respiration data collected from participants in multiparty meetings reveal that the speaker takes a breath more quickly and deeply after the end of an utterance in turn-keeping than in turn-changing. They also indicate that the listener who will be the next speaker takes a bigger breath more quickly and deeply in turn-changing than the other listeners. On the basis of these results, we constructed and evaluated models for predicting the next speaker and the time of the next utterance in multiparty meetings. The results of the evaluation suggest that the characteristics of the speaker's inhalation right after an utterance unit—the points in time at which the inhalation starts and ends after the end of the utterance unit and the amplitude, slope, and duration of the inhalation phase—are effective for predicting the next speaker in multiparty meetings. They further suggest that the characteristics of listeners' inhalation—the points in time at which the inhalation starts and ends after the end of the utterance unit and the minimum and maximum inspiration, amplitude, and slope of the inhalation phase—are effective for predicting the next speaker. The start time and end time of the next speaker's inhalation are also useful for predicting the time of the next utterance in turn-changing.
To build a model for predicting the next speaker and the start time of the next utterance in multi-party meetings, we performed a fundamental study of how respiration could be effective for the prediction model. The results of the analysis reveal that a speaker inhales more rapidly and quickly right after the end of a unit of utterance in turn-keeping. The next speaker takes a bigger breath toward speaking in turn-changing than listeners who will not become the next speaker. Based on the results of the analysis, we constructed the prediction models to evaluate how effective the parameters are. The results of the evaluation suggest that the speaker's inhalation right after a unit of utterance, such as the start time from the end of the unit of utterance and the slope and duration of the inhalation phase, is effective for predicting whether turn-keeping or turn-changing happen about 350 ms before the start time of the next utterance on average and that listener's inhalation before the next utterance, such as the maximal inspiration and amplitude of the inhalation phase, is effective for predicting the next speaker in turn-changing about 900 ms before the start time of the next utterance on average.
In multiparty meetings, participants need to predict the end of the speaker’s utterance and who will start speaking next, as well as consider a strategy for good timing to speak next. Gaze behavior plays an important role in smooth turn-changing. This article proposes a prediction model that features three processing steps to predict (I) whether turn-changing or turn-keeping will occur, (II) who will be the next speaker in turn-changing, and (III) the timing of the start of the next speaker’s utterance. For the feature values of the model, we focused on gaze transition patterns and the timing structure of eye contact between a speaker and a listener near the end of the speaker’s utterance. Gaze transition patterns provide information about the order in which gaze behavior changes. The timing structure of eye contact is defined as who looks at whom and who looks away first, the speaker or listener, when eye contact between the speaker and a listener occurs. We collected corpus data of multiparty meetings, using the data to demonstrate relationships between gaze transition patterns and timing structure and situations (I), (II), and (III). The results of our analyses indicate that the gaze transition pattern of the speaker and listener and the timing structure of eye contact have a strong association with turn-changing, the next speaker in turn-changing, and the start time of the next utterance. On the basis of the results, we constructed prediction models using the gaze transition patterns and timing structure. The gaze transition patterns were found to be useful in predicting turn-changing, the next speaker in turn-changing, and the start time of the next utterance. Contrary to expectations, we did not find that the timing structure is useful for predicting the next speaker and the start time. This study opens up new possibilities for predicting the next speaker and the timing of the next utterance using gaze transition patterns in multiparty meetings.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.