We evaluate the predictions of two theories of syntactic processing complexity, dependency locality theory (DLT) and surprisal, against the Dundee corpus, which contains the eye-tracking record of 10 participants reading 51,000 words of newspaper text. Our results show that DLT integration cost is not a significant predictor of reading times for arbitrary words in the corpus. However, DLT successfully predicts reading times for nouns and verbs. We also find evidence for integration cost effects at auxiliaries, not predicted by DLT. For surprisal, we demonstrate that an unlexicalized formulation of surprisal can predict reading times for arbitrary words in the corpus. Comparing DLT integration cost and surprisal, we find that the two measures are uncorrelated, which suggests that a complete theory will need to incorporate both aspects of processing complexity. We conclude that eye-tracking corpora, which provide reading time data for naturally occurring, contextualized sentences, can complement experimental evidence as a basis for theories of processing complexity.
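For readers unfamiliar with the measure, surprisal is standardly defined as the negative log probability of an item given its preceding context, so less expected items receive higher values. The abstract above does not say how its unlexicalized surprisal is computed. The sketch below is only an illustration under loud assumptions: it uses a toy bigram model over part-of-speech tags with add-alpha smoothing, and the names `train_bigram`, `surprisal`, and the three-sentence corpus are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter

def train_bigram(tag_sequences):
    """Count tag bigrams and their left contexts; '<s>' marks sentence start."""
    bigrams, contexts = Counter(), Counter()
    for tags in tag_sequences:
        prev = "<s>"
        for tag in tags:
            bigrams[(prev, tag)] += 1
            contexts[prev] += 1
            prev = tag
    return bigrams, contexts

def surprisal(prev, tag, bigrams, contexts, vocab_size, alpha=1.0):
    """Surprisal in bits: -log2 P(tag | prev), with add-alpha smoothing."""
    p = (bigrams[(prev, tag)] + alpha) / (contexts[prev] + alpha * vocab_size)
    return -math.log2(p)

# Hypothetical toy corpus of POS-tag sequences (an assumption, not Dundee data).
corpus = [
    ["DT", "NN", "VBD"],
    ["DT", "NN", "VBD", "DT", "NN"],
    ["DT", "JJ", "NN", "VBD"],
]
bigrams, contexts = train_bigram(corpus)
vocab = {tag for sentence in corpus for tag in sentence}

# A noun after a determiner is more expected than an adjective in this toy data,
# so it receives lower surprisal.
s_noun = surprisal("DT", "NN", bigrams, contexts, len(vocab))
s_adj = surprisal("DT", "JJ", bigrams, contexts, len(vocab))
print(f"DT -> NN: {s_noun:.2f} bits, DT -> JJ: {s_adj:.2f} bits")
```

Working over tags rather than word forms is what makes this sketch unlexicalized. A broad-coverage study would presumably derive the probabilities from a probabilistic parser rather than a bigram model.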
Psycholinguistic research shows that key properties of the human sentence processor are incrementality, connectedness (partial structures contain no unattached nodes), and prediction (upcoming syntactic structure is anticipated). However, there is currently no broad-coverage parsing model with these properties. In this article, we present the first broad-coverage probabilistic parser for PLTAG, a variant of TAG that supports all three requirements. We train our parser on a TAG-transformed version of the Penn Treebank and show that it achieves performance comparable to existing TAG parsers that are incremental but not predictive. We also use our PLTAG model to predict human reading times, demonstrating a better fit on the Dundee eye-tracking corpus than a standard surprisal model.
While it has long been known that the pupil reacts to cognitive load, pupil size has received little attention in cognitive research because of its long latency and the difficulty of separating effects of cognitive load from the light reflex or effects due to eye movements. A novel measure, the Index of Cognitive Activity (ICA), relates cognitive effort to the frequency of small rapid dilations of the pupil. We report here on a total of seven experiments which test whether the ICA reliably indexes linguistically induced cognitive load: three experiments in reading (a manipulation of grammatical gender match / mismatch, an experiment of semantic fit, and an experiment comparing locally ambiguous subject versus object relative clauses, all in German), three dual-task experiments with simultaneous driving and spoken language comprehension (using the same manipulations as in the single-task reading experiments), and a visual world experiment comparing the processing of causal versus concessive discourse markers. These experiments are the first to investigate the effect and time course of the ICA in language processing. All of our experiments support the idea that the ICA indexes linguistic processing difficulty. The effects of our linguistic manipulations on the ICA are consistent for reading and auditory presentation. Furthermore, our experiments show that the ICA allows for usage within a multi-task paradigm. Its robustness with respect to eye movements means that it is a valid measure of processing difficulty for usage within the visual world paradigm, which will allow researchers to assess both visual attention and processing difficulty at the same time, using an eye-tracker. We argue that the ICA is indicative of activity in the locus caeruleus area of the brain stem, which has recently also been linked to P600 effects observed in psycholinguistic EEG experiments.
In spoken dialog systems, information must be presented sequentially, making it difficult to quickly browse through a large number of options. Recent studies have shown that user satisfaction is negatively correlated with dialog duration, suggesting that systems should be designed to maximize the efficiency of the interactions. Analysis of the logs of 2,000 dialogs between users and nine different dialog systems reveals that a large percentage of the time is spent on the information presentation phase, so there is potentially a large pay-off to be gained from optimizing information presentation in spoken dialog systems. This article proposes a method that improves the efficiency of coping with large numbers of diverse options by selecting options and then structuring them based on a model of the user's preferences. This enables the dialog system to automatically determine trade-offs between alternative options that are relevant to the user and present these trade-offs explicitly. Multiple attractive options are thereby structured such that the user can gradually refine her request to find the optimal trade-off. To evaluate and challenge our approach, we conducted a series of experiments that test the effectiveness of the proposed strategy. Experimental results show that basing the content structuring and content selection process on a user model increases the efficiency and effectiveness of the user's interaction. Users complete their tasks more successfully and more quickly. Furthermore, user surveys revealed that participants found that the user-model-based system presents complex trade-offs understandably and increases overall user satisfaction. The experiments also indicate that presenting users with a brief overview of options that do not fit their requirements significantly improves the user's overview of available options and makes them feel more confident that they have been presented with all relevant options.
Recent research in psycholinguistics has provided increasing evidence that humans predict upcoming content. Prediction also affects perception and might be a key to robustness in human language processing. In this paper, we investigate the factors that affect human prediction by building a computational model that predicts upcoming discourse referents either from linguistic knowledge alone or from linguistic knowledge combined with common-sense knowledge in the form of scripts. We find that script knowledge significantly improves model estimates of human predictions. In a second study, we test the highly controversial hypothesis that predictability influences referring expression type but do not find evidence for such an effect.