Alexander Clark scite author profile

The question of whether humans represent grammatical knowledge as a binary condition on membership in a set of well-formed sentences, or as a probabilistic property, has been the subject of debate among linguists, psychologists, and cognitive scientists for many decades. Acceptability judgments present a serious problem for both classical binary and probabilistic theories of grammaticality. These judgments are gradient in nature, and so cannot be directly accommodated in a binary formal grammar. However, it is also not possible to simply reduce acceptability to probability. The acceptability of a sentence is not the same as the likelihood of its occurrence, which is, in part, determined by factors like sentence length and lexical frequency. In this paper, we present the results of a set of large-scale experiments using crowd-sourced acceptability judgments that demonstrate gradience to be a pervasive feature in acceptability judgments. We then show how one can predict acceptability judgments on the basis of probability by augmenting probabilistic language models with an acceptability measure. This is a function that normalizes probability values to eliminate the confounding factors of length and lexical frequency. We describe a sequence of modeling experiments with unsupervised language models drawn from state-of-the-art machine learning methods in natural language processing. Several of these models achieve very encouraging levels of accuracy in the acceptability prediction task, as measured by the correlation between the acceptability measure scores and mean human acceptability values. We consider the relevance of these results to the debate on the nature of grammatical competence, and we argue that they support the view that linguistic knowledge can be intrinsically probabilistic.
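The normalization idea in the abstract above can be illustrated with a small sketch. The abstract does not give the exact functions used, so the two measures below are assumptions chosen for illustration: a per-token mean log-probability (to discount length), and a variant that also subtracts a unigram log-probability (to discount lexical frequency). All function and parameter names are hypothetical, and the sentence log-probability is assumed to come from some external language model.

```python
import math
from collections import Counter


def unigram_logprob(tokens, counts, total, vocab_size):
    """Add-one smoothed unigram log-probability of a token sequence.

    This stands in for the 'lexical frequency' component; the smoothing
    scheme is an illustrative assumption, not taken from the paper.
    """
    return sum(
        math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens
    )


def mean_logprob(model_logprob, tokens):
    """Length normalization: average log-probability per token."""
    return model_logprob / len(tokens)


def length_and_frequency_normalized(model_logprob, tokens, counts, total, vocab_size):
    """Subtract the unigram log-probability, then divide by length.

    A sentence made of rare words is no longer penalized merely for
    containing them, and long sentences are not penalized for length.
    """
    uni = unigram_logprob(tokens, counts, total, vocab_size)
    return (model_logprob - uni) / len(tokens)


# Toy corpus statistics (hypothetical data for illustration only).
corpus = "the cat sat on the mat".split()
counts = Counter(corpus)
total = len(corpus)
vocab_size = len(counts)

sentence = ["the", "cat"]
# Pretend a language model assigned this log-probability to the sentence.
model_lp = -3.0
score = length_and_frequency_normalized(model_lp, sentence, counts, total, vocab_size)
```

Scores of this kind would then be compared against mean human ratings, for example with a correlation coefficient, as the abstract describes.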

Linguistic Nativism and the Poverty of the Stimulus

Clark¹,

Lappin²

2011

111

This highly readable but game-changing book shows to what extent the "poverty of the stimulus" argument stems from nothing more than poverty of the imagination. A must read for generative linguists. (Ivan Sag, Stanford University)

For fifty years, the "poverty of the stimulus" has driven "nativist" linguistics. Clark and Lappin challenge the POS and develop a formal foundation for language learning. This brilliant book should be mandatory reading for anyone who wants to understand the most fundamental question in linguistics. (Richard Sproat, Oregon Health and Science University)

Clark and Lappin provide a brilliant and wide-ranging re-examination of one of the most important questions in cognitive science: how much innate structure is required to support language acquisition. A remarkable achievement. (Nick Chater, Professor of Behavioural Science, University of Warwick)

This comprehensive cutting-edge treatise on linguistic nativism skillfully untangles the human capacity to effortlessly learn languages, from claims that this capacity is specific to language.

Unsupervised induction of stochastic context-free grammars using distributional clustering

Clark

2001

106

An algorithm is presented for learning a phrase-structure grammar from tagged text. It clusters sequences of tags together based on local distributional information, and selects clusters that satisfy a novel mutual information criterion. This criterion is shown to be related to the entropy of a random variable associated with the tree structures, and it is demonstrated that it selects linguistically plausible constituents. This is incorporated in a Minimum Description Length algorithm. The evaluation of unsupervised models is discussed, and results are presented when the algorithm has been trained on 12 million words of the British National Corpus.
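As a rough illustration of how a mutual information score over local distributional context might be computed, the sketch below measures the mutual information between the tag immediately before and the tag immediately after each occurrence of a candidate tag sequence. The abstract does not specify the criterion, so this reading, the boundary markers, and the function name are all assumptions rather than the algorithm from the paper.

```python
import math
from collections import Counter


def context_mutual_information(tagged_sentences, sequence):
    """Mutual information (in nats) between the left and right context tags
    of every occurrence of `sequence` in `tagged_sentences`.

    Sentences are padded with hypothetical boundary markers so that
    occurrences at the edges still have a context on both sides.
    """
    sequence = list(sequence)
    n = len(sequence)
    pairs = Counter()
    for sent in tagged_sentences:
        padded = ["<s>"] + list(sent) + ["</s>"]
        # i ranges over start positions that leave room for a following tag.
        for i in range(1, len(padded) - n):
            if padded[i:i + n] == sequence:
                pairs[(padded[i - 1], padded[i + n])] += 1

    total = sum(pairs.values())
    if total == 0:
        return 0.0

    # Marginal counts of left and right context tags.
    left, right = Counter(), Counter()
    for (l, r), c in pairs.items():
        left[l] += c
        right[r] += c

    # Sum of p(l, r) * log(p(l, r) / (p(l) * p(r))) over observed pairs.
    return sum(
        (c / total) * math.log(c * total / (left[l] * right[r]))
        for (l, r), c in pairs.items()
    )


# Toy tag sequences (hypothetical data): the left tag fully predicts the right tag.
dependent = [["A", "X", "B"], ["C", "X", "D"]]
mi_dependent = context_mutual_information(dependent, ["X"])
```

Under this reading, a sequence whose left and right contexts are statistically dependent gets a high score, while one whose contexts vary independently scores near zero.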

Alexander Clark

Combining distributional and morphological information for part of speech induction

Identification in the Limit of Substitutable Context-Free Languages

Grammaticality, Acceptability, and Probability: A Probabilistic View of Linguistic Knowledge

Linguistic Nativism and the Poverty of the Stimulus

Unsupervised induction of stochastic context-free grammars using distributional clustering
