2007
DOI: 10.1109/tasl.2006.881703

An Acoustic Measure for Word Prominence in Spontaneous Speech

Abstract: An algorithm for automatic speech prominence detection is reported in this paper. We describe a comparative analysis on various acoustic features for word prominence detection and report results using a spoken dialog corpus with manually assigned prominence labels. The focus is on features such as spectral intensity and speech rate that are directly extracted from speech based on a correlation-based approach without requiring explicit linguistic or phonetic knowledge. Additionally, various pitch-based measures…

Cited by 66 publications (67 citation statements)
References 24 publications
“…decoding in speech recognition, and hence can be used for syntactic parsing [22], speech comprehension [23], and more recently, research is focusing on the audio-visual function of prosody, for example, in [24] it was found that visual prominence can enhance speech comprehension if coupled with the acoustic. Hence, it is important for speech driven avatars to detect prominent segments in the speech signal to drive gestures.…”
Section: Acoustic Detection Of Prominence
Citation type: mentioning
confidence: 99%
“…Nonverbal information, such as the volume and speed of speech, gestures, pauses, facial expressions, etc., is an important factor to fully understand dialogs or communications [1,2,3]. Nonetheless, after speech signal is converted into text by a speech recognition system, only the language information remains and is utilized for comprehending the meaning of the speech.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Kim and Scassellati made use of prosodic features of English for robots to learn behavior [7]. Wang and Narayanan studied sentence boundary detection and key information extraction by measuring the speed of English utterances [3,8]. S.-A.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…This idea was expanded on by Wang to include temporal correlation and a number of other improvements [3,4], and this is the method employed in this work. Wang used this detection method in [12] to create speaking rate and syllable length features for automatic speech prominence detection.…”
Section: Introduction
Citation type: mentioning
confidence: 99%