2010
DOI: 10.1016/j.specom.2010.01.003

Using prosody to improve automatic speech recognition

Cited by 36 publications (17 citation statements)
References 9 publications
“…Since prosody provides essential discourse information that is available only from spoken language, there has been a significant amount of research towards its use for Automatic Speech Recognition (ASR) [5,6,7,8,9] and Spoken Language Understanding (SLU) [10,11,12,13] as well as its impact on ASR errors [14,15]. On a similar motivational basis as our work, Shriberg and Stolcke [13] use prosodic modelling to improve ASR and several subtasks of SLU.…”
Section: Introduction
Citation type: mentioning
confidence: 80%
“…In [14], a HMM approach was proposed, further enhanced by [11], to automatically recover the PP structure of speech utterances. The algorithm involves a modelling step carried out by machine learning for the 7 different PP models in Hungarian for declarative modality (as presented in Table 1 [11]).…”
Section: Phonological Phrasing
Citation type: mentioning
confidence: 99%
“…Just like in an ASR system, backtracking is possible at intermittent points if a longer continuous speech stream is processed. Details of the approach, including acoustic feature extraction, training data, parameter settings and exhaustive evaluation for automatic phrasing, stress detection and word-boundary detection were presented in [11], hence the reader is referred to [11] and [14] for more information. Here we briefly mention that precision and recall of phrase boundaries was 0.89 for Hungarian on a read speech corpus (for the operation point characterized by equal precision and recall).…”
Section: Phonological Phrasing
Citation type: mentioning
confidence: 99%
“…Although, in the literature, we can find many articles regarding the automatic speech segmentation [1][2][3][4][5][6][7], the problems are not completely solved yet. The most common segmentation methods use LPC (Linear Predictive Coding), HMM (Hidden Markov Models), SVM (Support Vector Machine), the cepstrum based methods, and statistical methods.…”
Section: Introduction
Citation type: mentioning
confidence: 99%