Linguistic experience affects phonetic perception. However, the critical period during which experience affects perception and the mechanism responsible for these effects are unknown. This study of 6-month-old infants from two countries, the United States and Sweden, shows that exposure to a specific language in the first half year of life alters infants' phonetic perception.
This book presents a theory of speech-sound generation in the human vocal system. The comprehensive acoustic theory serves as one basis for defining categories of speech sounds used to form distinctions between words in languages. The author begins with a review of the anatomy and physiology of speech production, then covers source mechanisms, the vocal tract as an acoustic filter, relevant aspects of auditory psychophysics and physiology, and phonological representations. In the remaining chapters he presents a detailed examination of vowels, consonants, and the influence of context on speech-sound production. Although he focuses mainly on the sounds of English, he touches briefly on sounds in other languages. The book will serve as a reference for speech scientists, speech pathologists, linguists interested in phonetics and phonology, psychologists interested in speech perception and production, and engineers concerned with speech processing applications.
This paper describes a test of everyday speech reception, in which a listener's utilization of the linguistic-situational information of speech is assessed and compared with the utilization of acoustic-phonetic information. The test items are sentences presented in babble-type noise, and the listener's response is the final word in the sentence (the key word), which is always a monosyllabic noun. Two types of sentences are used: high-predictability items, for which the key word is somewhat predictable from the context, and low-predictability items, for which the final word cannot be predicted from the context. Both types are included in several 50-item forms of the test, which are balanced for intelligibility, key-word familiarity and predictability, phonetic content, and length. Performance of normally hearing listeners at various signal-to-noise ratios shows significantly different functions for low- and high-predictability items. The potential applications of this test, particularly in the assessment of speech reception in the hearing impaired, are discussed. PACS numbers: 43.70.Ep, 43.50.Qp, 43.70.Ve

Some classes of sounds are more susceptible to masking by noise than others, and consequently words containing these sounds are likely to be less intelligible than words containing sounds that are resistant to masking. In developing any test of speech intelligibility, therefore, care must be taken to select speech materials in which the phonetic content is properly balanced to reflect the distribution of speech-sound classes that occur in the language. The acoustic attributes of sentences include not only the properties of phonetic units but also prosodic parameters that signal the characteristics of larger units within the sentences. These consist of variations in the durations of sounds and in the fundamental frequency of voiced sounds.
There is evidence that these prosodic parameters are used by a listener as cues for the understanding of sentences, since they contribute information about stress and the grouping of words.

B. Effect of sentence context

The fact that in a noisy environment words in a sentence context are more intelligible than words spoken in isolation, or without the benefit of sentence context, has been demonstrated by Miller, Heise, and Lichten, and by Miller. These investigators argued that the sentence context imposes constraints on the set of alternative words that are available as responses at a particular location in a sentence, and noted that the intelligibility of words increases when the number of response alternatives decreases. This conclusion was supported and quantified further by Duffy and Giolas, who examined the intelligibility of words in sentences in which the words had various de… …made to introduce some variety in the syntactic structure of the PH and PL sentences, subject to the various constraints noted above. For 95 words, two PH sentences were constructed. Since there was no obvious way to decide…
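The scoring procedure the paper implies can be sketched in code. This is a minimal illustration, not the authors' implementation: it assumes hypothetical item records tagged with the key word, a predictability label, and the signal-to-noise ratio, and tallies percent-correct per (predictability, SNR) cell so that the two psychometric functions can be compared.

```python
# Hypothetical scoring sketch for a SPIN-style test. Each item carries its
# key word, a predictability tag ("high" or "low"), and the SNR in dB; the
# listener's response is the reported final word of the sentence.
from collections import defaultdict

def score_spin(items, responses):
    """Return percent-correct keyed by (predictability, snr_db)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, resp in zip(items, responses):
        cell = (item["predictability"], item["snr_db"])
        total[cell] += 1
        if resp.strip().lower() == item["key_word"].lower():
            correct[cell] += 1
    return {cell: 100.0 * correct[cell] / total[cell] for cell in total}

items = [
    {"key_word": "boat", "predictability": "high", "snr_db": 0},
    {"key_word": "sin", "predictability": "low", "snr_db": 0},
]
print(score_spin(items, ["boat", "tin"]))
# → {('high', 0): 100.0, ('low', 0): 0.0}
```

Plotting the resulting percentages against SNR, separately for the high- and low-predictability cells, gives the two intelligibility functions the abstract describes.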
This paper describes some further attempts to identify and measure those parameters in the speech signal that reflect the emotional state of a speaker. High-quality recordings were obtained of professional "method" actors reading the dialogue of a short scenario written specifically to contain various emotional situations. Excerpted portions of the recordings were subjected to both quantitative and qualitative analyses. A comparison was also made between recordings from a real-life situation, in which the emotions of a speaker were clearly defined, and recordings from an actor who simulated the same situation. Anger, fear, and sorrow situations tended to produce characteristic differences in the contour of fundamental frequency, average speech spectrum, temporal characteristics, precision of articulation, and waveform regularity of successive glottal pulses. Attributes for a given emotional situation were not always consistent from one speaker to another. SUBJECT CLASSIFICATION: 9.5, 9.3.

…would be convenient if an indication of this state could be obtained through analysis of the acoustic characteristics of his utterances. (2) Studies of speech attributes related to emotional state may help to contribute toward a general theory of speech performance. Such a theory should have two components: one that specifies the acoustic correlates of the linguistic units used for communication between speakers of a given language, and another that describes the extralinguistic aspects of speech communication.

I. APPROACH

In planning the study, two approaches were considered: (1) a detailed analysis of "field" recordings, where there would be no question as to the emotion present in the individual speaking, and (2) an analysis of high-quality recordings of professional actors simulating various emotions.
The second approach was selected for the major portion of our work, since it seemed to afford the best opportunity for obtaining good recordings that could be subjected to both quantitative and qualitative analyses. Because actors are presumably able to portray clear and unambiguous emotions, their utterances provide a means for exploring the basic manifestations of emotional speech. Field recordings often reflect the simultaneous presence of several emotions and a lack of control over the speech material. While an approach using actors is not novel, investigators employing it have never, to our knowledge, performed a spectrographic analysis of the recorded speech material. Since we believed that the emotions of interest might best be described in terms of specific situations involving emotional interaction among several people, the decision was made to make use of a short play. Getting the actors involved in clearly defined situations would, hopefully, result in their experiencing and expressing the various emotions to be studied. The primary function of the play was to elicit the desired emotions from the actors and to serve as the carrier for selected phrases and sentences to be embedded in different emotional situations. These ph…
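The fundamental-frequency measures mentioned above can be illustrated with a small sketch. This is not the authors' analysis; it assumes a per-frame F0 contour has already been extracted by some pitch tracker (with 0.0 marking unvoiced frames) and computes the kinds of summary statistics one might compare across emotional conditions: median F0, F0 range, and the voiced fraction as a crude temporal measure.

```python
# Illustrative F0-contour summary (assumed inputs, not the paper's method).
# f0_hz: per-frame fundamental-frequency estimates in Hz; 0.0 = unvoiced.
def f0_summary(f0_hz):
    voiced = sorted(v for v in f0_hz if v > 0)
    if not voiced:
        return {"median_f0": 0.0, "f0_range": 0.0, "voiced_fraction": 0.0}
    return {
        "median_f0": voiced[len(voiced) // 2],       # middle voiced value
        "f0_range": voiced[-1] - voiced[0],          # max minus min F0
        "voiced_fraction": len(voiced) / len(f0_hz), # rough timing measure
    }

print(f0_summary([0.0, 110.0, 120.0, 130.0, 0.0, 0.0]))
# → {'median_f0': 120.0, 'f0_range': 20.0, 'voiced_fraction': 0.5}
```

In a study like this one, such summaries would be computed per excerpt and compared between the anger, fear, and sorrow situations.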
In a series of experiments, identification responses for place of articulation were obtained for synthetic stop consonants in consonant-vowel syllables with different vowels. The acoustic attributes of the consonants were systematically manipulated, the selection of stimulus characteristics being guided in part by theoretical considerations concerning the expected properties of the sound generated in the vocal tract as place of articulation is varied. Several stimulus series were generated with and without noise bursts at the onset, and with and without formant transitions following consonantal release. Stimuli with transitions only, and with bursts plus transitions, were consistently classified according to place of articulation, whereas stimuli with bursts only and no transitions were not consistently identified. The acoustic attributes of the stimuli were examined to determine whether invariant properties characterized each place of articulation independent of vowel context. It was determined that the gross shape of the spectrum sampled at the consonantal release showed a distinctive shape for each place of articulation: a prominent midfrequency spectral peak for velars, a diffuse-rising spectrum for alveolars, and a diffuse-falling spectrum for labials. These attributes are evident for stimuli containing transitions only, but are enhanced by the presence of noise bursts at the onset.
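The three gross spectral shapes can be made concrete with a toy classifier. This is only an illustration of the idea, not the authors' metric: it assumes a smoothed magnitude spectrum (in dB) sampled at consonantal release, splits it into low-, mid-, and high-frequency thirds, and labels the shape compact (velar-like) when the mid band dominates, otherwise diffuse-rising (alveolar-like) or diffuse-falling (labial-like) from the low-to-high tilt. The 6 dB prominence threshold is an arbitrary choice for the sketch.

```python
# Toy gross-spectral-shape labeler (illustrative heuristic, assumed inputs).
# spectrum_db: smoothed magnitude spectrum in dB, low to high frequency.
def gross_shape(spectrum_db):
    n = len(spectrum_db)
    third = n // 3
    lo = sum(spectrum_db[:third]) / third
    mid = sum(spectrum_db[third:2 * third]) / third
    hi = sum(spectrum_db[2 * third:]) / (n - 2 * third)
    if mid > lo + 6 and mid > hi + 6:   # prominent mid-frequency peak
        return "compact (velar-like)"
    if hi > lo:                         # energy rising with frequency
        return "diffuse-rising (alveolar-like)"
    return "diffuse-falling (labial-like)"

print(gross_shape([40, 42, 41, 60, 62, 61, 43, 44, 42]))  # mid peak
# → compact (velar-like)
print(gross_shape([30, 31, 32, 33, 34, 35, 50, 51, 52]))  # rising tilt
# → diffuse-rising (alveolar-like)
```

A real analysis would of course work from formant-scale smoothing of the onset spectrum rather than equal-width thirds, but the band comparison captures the compact/diffuse and rising/falling distinctions the abstract describes.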