2019
DOI: 10.3390/mti3040070

Prediction of Who Will Be Next Speaker and When Using Mouth-Opening Pattern in Multi-Party Conversation

Abstract: We investigated the mouth-opening transition pattern (MOTP), which represents the change of mouth-opening degree during the end of an utterance, and used it to predict the next speaker and utterance interval between the start time of the next speaker’s utterance and the end time of the current speaker’s utterance in a multi-party conversation. We first collected verbal and nonverbal data that include speech and the degree of mouth opening (closed, narrow-open, wide-open) of participants that were manually anno…
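The abstract names three mouth-opening states and two prediction targets (the next speaker and the utterance interval). As a minimal sketch of how such features might feed a classifier, here is one plausible encoding: bigram transition counts over the mouth states near the utterance end, fed to an SVM. The encoding, the toy data generator, and the classifier choice are all assumptions for illustration; the excerpt does not specify the authors' actual pipeline.

```python
# A minimal sketch, NOT the paper's implementation: the feature encoding
# (bigram transition counts of mouth states near the utterance end), the
# toy data generator, and the SVM choice are assumptions for illustration.
from itertools import product
import random

import numpy as np
from sklearn.svm import SVC

STATES = ["closed", "narrow-open", "wide-open"]  # states named in the abstract
BIGRAMS = list(product(STATES, repeat=2))        # all 9 possible transitions

def motp_features(state_seq):
    """Encode a per-frame mouth-opening sequence as transition counts."""
    counts = dict.fromkeys(BIGRAMS, 0)
    for pair in zip(state_seq, state_seq[1:]):
        counts[pair] += 1
    return np.array([counts[b] for b in BIGRAMS], dtype=float)

def toy_sequence(turn_change, n_frames=30):
    """Hypothetical generator: assume turn-yielding ends show more 'closed'
    frames. Real data would come from annotated conversation video."""
    w = [0.6, 0.3, 0.1] if turn_change else [0.2, 0.3, 0.5]
    return random.choices(STATES, weights=w, k=n_frames)

random.seed(0)
labels = np.array([random.randint(0, 1) for _ in range(200)])  # 1 = turn change
X = np.stack([motp_features(toy_sequence(bool(y))) for y in labels])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict(motp_features(toy_sequence(True)).reshape(1, -1)))
```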


Cited by 13 publications (9 citation statements)
References 42 publications
“…They proposed a light SVM method that could be deployed on any agent equipped with a camera or a depth sensor. Similarly, in multi-party conversations, Ishii et al. (2019) found that the speaker's and listeners' mouth-opening transition patterns could be used to predict the next speaker and the interval between the end of the current utterance and the start of the next. To demonstrate this, they developed a three-step system.…”
Section: High-level (mentioning)
Confidence: 99%
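The excerpt says only that a "three-step system" was built. One plausible decomposition, inferred from the abstract's prediction targets rather than taken from the paper, is: classify turn-changing vs. turn-keeping, select the next speaker among the listeners, and regress the utterance interval. A hedged sketch over MOTP feature vectors:

```python
# Hypothetical three-step pipeline; the step breakdown is assumed from the
# abstract's prediction targets, not copied from the published system.
import numpy as np
from sklearn.svm import SVC, SVR

class TurnPredictor:
    def __init__(self):
        self.change_clf = SVC()    # step 1: turn-changing vs. turn-keeping
        self.speaker_clf = SVC()   # step 2: next speaker among the listeners
        self.interval_reg = SVR()  # step 3: utterance interval in seconds

    def fit(self, X, change, speaker, interval):
        X, change = np.asarray(X), np.asarray(change)
        self.change_clf.fit(X, change)
        mask = change == 1  # speaker model trains only on turn-change cases
        self.speaker_clf.fit(X[mask], np.asarray(speaker)[mask])
        self.interval_reg.fit(X, interval)
        return self

    def predict(self, x):
        x = np.asarray(x).reshape(1, -1)
        gap = float(self.interval_reg.predict(x)[0])
        if self.change_clf.predict(x)[0] == 0:
            return "keep", None, gap
        return "change", self.speaker_clf.predict(x)[0], gap
```

Under the abstract's definition of the interval (next start time minus current end time), a negative predicted value would correspond to overlapping speech; the excerpt does not say how the original system handles that case.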
“…Others have used visual features, such as overall physical motion (Chen and Harper, 2009; de Kok and Heylen, 2009; Dielmann et al., 2010; Roddy et al., 2018) near the end of a speaker's utterances or during multiple utterances. Moreover, some research has focused on detailed non-verbal behaviors, such as eye-gaze behavior (Chen and Harper, 2009; de Kok and Heylen, 2009; Huang et al., 2011; Jokinen et al., 2013; Ishii et al., 2015a, 2016a), head movement (Huang et al., 2011; Ishii et al., 2015b, 2017), mouth movement (Ishii et al., 2019), and respiration (Ishii et al., 2015a, 2016b). Specifically, the duration and patterns of a speaker's gaze toward a listener while speaking, the amount of head movement, the patterns of mouth opening and closing, and the inspiratory volume are used as features for prediction.…”
Section: Related Work (mentioning)
Confidence: 99%
“…However, many studies on turn-changing prediction mainly use the features mentioned above extracted only from speakers (Chen and Harper, 2009; de Kok and Heylen, 2009; Dielmann et al., 2010; Huang et al., 2011; Jokinen et al., 2013; Lala et al., 2018; Masumura et al., 2018, 2019; Roddy et al., 2018). Several studies have used limited features and modalities of listeners, such as linguistic features, eye-gaze behavior, head movement, mouth movement, and respiration, as mentioned above (Ishii et al., 2015a, b, 2016a, b, 2017, 2019; Masumura et al., 2018).…”
Section: Related Work (mentioning)
Confidence: 99%
“…With such knowledge, many studies have developed models for predicting actual turn-changing, i.e., whether turn-changing or turn-keeping will take place, on the basis of acoustic features [3, 6, 10, 12, 18, 26, 34, 36–38, 43, 47, 50], linguistic features [34, 37, 38, 43], and visual features, such as overall physical motion [3, 6, 8, 43] near the end of a speaker's utterances or during multiple utterances. Moreover, some research has focused on detailed non-verbal behaviors such as eye-gaze behavior [3, 6, 18, 20, 24, 26], head movement [18, 21, 22], mouth movement [23], and respiration [20, 25]. However, many turn-changing prediction studies mainly use features extracted from speakers.…”
Section: Related Work 2.1 Turn-changing Prediction Technology (mentioning)
Confidence: 99%