2005
DOI: 10.1527/tjsai.20.220

Response Timing Detection Using Prosodic and Linguistic Information for Human-friendly Spoken Dialog Systems

Abstract: If a dialog system can respond to the user as reasonably as a human, the interaction will become smoother. Timing of the response such as back-channels and turn-taking plays an important role in such a smooth dialog as in human-human interaction. We developed a response timing generator for such a dialog system. This generator uses a decision tree to detect the timing based on the features coming from some prosodic and linguistic information. The timing generator decides the action of the system at ever…

Citations: cited by 36 publications (36 citation statements)
References: 12 publications
“…Kitaoka et al used first-order regression coefficients of pitch and power contours to describe patterns and generate response timing [29]. Nishimura et al pointed out that both the last short regions and the longer ones contained information which triggered backchannel responses [30].…”
Section: Prosodic Features
Citation type: mentioning
confidence: 99%
“…We exploited five features that may be effective for our task by referring to previous work on turn-taking decision [10]- [12]. There might be other features that are effective, but exploring such features is among the future work.…”
Section: Features
Citation type: mentioning
confidence: 99%
“…Ohsuga et al identified prosodic features that are helpful for determining ends of turns with decision tree learning on the Japanese Map Task Corpus [11]. Kitaoka et al also used both prosodic and linguistic information to determine timing of system response generation [12]. Edlund et al developed a prosodic analysis tool to augment end-point detection [13].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Switching pauses [4], which are defined as pauses between turns, have been regarded as a distinctive property of spoken dialogue as a form of social interaction [5,6]. Unlike pauses in monologues or intrapersonal pauses in dialogues, the duration of switching pauses has an aspect similar to that of reaction time.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Although some previous studies [11,12] reported the effects of emotional state on the duration of intrautterance pauses, they did not deal with switching pauses. Furthermore, because the generation of response timing has been treated as an independent module from the speech synthesizer in most spoken dialogue systems [6,13], the finely tuned modeling of switching pause duration taking paralinguistic effects into account has specific importance in the design of responsive human interfaces.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
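
One of the citation statements above notes that Kitaoka et al. used first-order regression coefficients of pitch and power contours as features. As a rough illustration of what such a feature looks like, the sketch below computes the least-squares slope of a short contour. The function name `regression_slope`, the `frame_step` parameter, and the sample values are assumptions made for this example; they are not taken from the paper, and the paper's actual feature extraction may differ.

```python
from typing import Sequence


def regression_slope(values: Sequence[float], frame_step: float = 1.0) -> float:
    """Least-squares slope of a contour sampled at a fixed frame step.

    `values` is a hypothetical window of pitch (or power) measurements,
    one per frame; `frame_step` is the assumed time between frames.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two frames to fit a slope")

    # Time stamp of each frame, starting at zero.
    times = [i * frame_step for i in range(n)]
    mean_t = sum(times) / n
    mean_v = sum(values) / n

    # Slope = covariance(time, value) / variance(time).
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator


# Illustrative contours (made-up values, in Hz per frame).
rising = regression_slope([100.0, 110.0, 120.0, 130.0])
falling = regression_slope([130.0, 120.0, 110.0, 100.0])
print(rising, falling)
```

A positive slope indicates a rising contour over the window and a negative slope a falling one; a classifier such as a decision tree could take values like these as inputs.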