A visual world experiment examined the time course of pragmatic inferences derived from visual context and contrastive intonation contours. We used the construction It looks like an X pronounced with either (a) an H* pitch accent on the final noun and a low boundary tone, or (b) a contrastive L+H* pitch accent and a rising boundary tone, a contour that can support contrastive inference (e.g., It LOOKS (L+H*) like a zebra (L-H%)... (but it is not)). When the visual display contained a single relevant set of contrasting pictures (e.g., a zebra vs. a zebra-like animal), effects of the contrastive LOOKS (L+H*) emerged prior to the processing of phonemic information from the target noun. The results indicate that prosodic processing is incremental and guided by contextually supported expectations. Additional analyses ruled out explanations based on context-independent heuristics that might substitute for online computation of contrast.
According to Grice's (1975) Maxim of Quantity, rational talkers formulate their utterances to be as economical as possible while conveying all necessary information. Naturally produced referential expressions, however, often contain more or less information than is predicted to be optimal under a rational speaker model. How do listeners cope with this variation in the linguistic input? We argue that listeners navigate the variability in referential resolution by calibrating their expectations about how much linguistic signal will be expended for a given meaning, and by doing so in a context- or talker-specific manner. Focusing on talker-specificity, we present four experiments. We first establish that listeners generalize information from a single pair of adjectives to unseen adjectives in a talker-specific manner (Experiment 1). Focusing initially on exposure to under-specified utterances, Experiment 2 examines (a) the dimension of generalization, (b) effects of the strength of the evidence (implicit or explicit), and (c) individual differences in dimensions of generalization. Experiments 3 and 4 ask parallel questions for exposure to over-specified utterances, where we predict more conservative generalization because, in spontaneous utterances, talkers are more likely to over-modify than under-modify.
Speech prosody, in particular rhythm and intonation, plays an important role in the communication of meaning. Rising vs. falling intonation contours that signal a speaker's intended communicative meaning (i.e., asking a question vs. making a statement) are widely recognized as a primary example of such prosody. However, what appears to be a straightforward mapping between acoustic features of prosody and hypothesized meanings in fact presents a challenge to human perceptual and computational mechanisms. Perceptible features of prosody vary across contexts (e.g., talkers), creating ubiquitous ambiguity in the mapping. Here, we first characterize the structured nature of the variability in the intonational prosody used to signal a question vs. a statement in English. We then demonstrate that listeners can learn to adapt their expectations about the prosody-meaning mapping according to an inferred underlying structure of the environmental input. We argue that rich and dynamic representations of the prosodic input allow listeners to infer the mapping that is most likely given the characteristics of the current context.