2010
DOI: 10.1121/1.3493450

Human phoneme recognition depending on speech-intrinsic variability

Abstract: The influence of different sources of speech-intrinsic variation (speaking rate, effort, style and dialect or accent) on human speech perception was investigated. In listening experiments with 16 listeners, confusions of consonant-vowel-consonant (CVC) and vowel-consonant-vowel (VCV) sounds in speech-weighted noise were analyzed. Experiments were based on the OLLO logatome speech database, which was designed for a man-machine comparison. It contains utterances spoken by 50 speakers from five dialect/accent reg…

Cited by 40 publications (42 citation statements)
References: 36 publications
“…3 dB if different speakers within the same language are used. Meyer et al [10] for example, found an effect of 2.7 dB for speakers that varied in dialect and accent and an effect of speaking rate, effort or style amounting to 1.4 dB within the same speaker. It turns out that in general, the telephone version has higher (i.e., worse) SRTs than the broadband version using headphones. This is probably due to a loss of speech information and unilateral presentation when using a telephone [4].…”
Section: Digit Triplets Test (mentioning)
confidence: 96%
“…The binomial (or multinomial) model is appropriate for data in each cell of a confusion-count matrix, and statistical estimation of the unknown proportion parameter of the binomial distribution has been studied extensively, e.g., [17]-[21]. Phoneme confusions have been measured in a very large number of studies; see, e.g., recent reviews in [16], [22], [23].…”
Section: Introduction (mentioning)
confidence: 99%
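As a rough illustration of the binomial view of a single confusion-matrix cell described in the statement above, the sketch below estimates the proportion parameter and an approximate 95% interval. The Wilson score interval is used here only as one common choice; the citing paper does not say which estimator it uses, and the function name and the counts (12 of 60) are hypothetical.

```python
import math


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion k/n.

    k: number of times one particular response was given (hypothetical cell count)
    n: number of presentations of the stimulus (hypothetical row total)
    z: standard-normal quantile, 1.96 for an approximate 95% interval
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = k / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical cell: one response chosen 12 times in 60 presentations.
low, high = wilson_interval(12, 60)
print(f"estimate = {12 / 60:.3f}, interval = ({low:.3f}, {high:.3f})")
```

Unlike the simple normal approximation, this interval always stays within [0, 1], which matters for cells whose counts are close to zero or to the row total.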
“…Conventional statistical tests for the significance of an observed difference in PC or MI between test conditions must use the observed variations among individual results to estimate the reliability. Parametric test methods such as ANOVA have been applied, e.g., in [22], [25], [26]. When PC and MI results are close to their upper or lower limits, it is obviously questionable to assume that data follow a Gaussian distribution.…”
Section: Introduction (mentioning)
confidence: 99%
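The concern about Gaussian assumptions near the limits can be made concrete with a small sketch. It assumes that PC denotes a proportion-correct score (the quoted statement does not define it) and uses hypothetical counts; the normal-approximation (Wald) interval is shown only to illustrate the problem, not as a method from the citing paper.

```python
import math


def wald_interval(k, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion k/n."""
    p_hat = k / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


# Hypothetical listener score close to ceiling: 58 correct out of 60.
low, high = wald_interval(58, 60)
print(f"interval = ({low:.3f}, {high:.3f})")  # upper bound exceeds 1.0

# At the ceiling itself the estimated variance collapses to zero,
# so the interval degenerates to a single point.
print(wald_interval(60, 60))
```

An upper bound above 1.0 and a zero-width interval at the ceiling are both signs that a symmetric Gaussian error model does not describe such data well.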
“…The OLdenburg LOgatome (OLLO) corpus version 2.0 [27], a large speech corpus freely available for research purposes that holds recordings of 150 logatomes uttered by 50 speakers of both sexes (25 women), was selected as the most appropriate corpus for this investigation's objectives. It consists of a set of 80 logatomes of the consonant-vowel-consonant (CVC) form and a set of 70 vowel-consonant-vowel (VCV) logatomes, where each of these 150 logatomes was uttered three times by 40 German and 10 French speakers in their normal speaking style.…”
Section: Speech Corpus (mentioning)
confidence: 99%
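The figures in the statement above imply the size of the normal-style subset it describes. The short sketch below only reproduces that arithmetic; the variable names are illustrative, and the count covers just the recordings named in the quoted statement, not any other material the corpus may contain.

```python
# Figures taken from the quoted citation statement.
N_CVC = 80                                # consonant-vowel-consonant logatomes
N_VCV = 70                                # vowel-consonant-vowel logatomes
REPETITIONS = 3                           # each logatome uttered three times
SPEAKERS = {"German": 40, "French": 10}   # speakers per language

n_logatomes = N_CVC + N_VCV
n_speakers = sum(SPEAKERS.values())
n_utterances = n_logatomes * REPETITIONS * n_speakers

print(n_logatomes, n_speakers, n_utterances)
```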