There has been a recent boom in research relating semantic space computational models to fMRI data, in an effort to better understand how the brain represents semantic information. In the first study reported here, we expanded on a previous study to examine how different semantic space models and modeling parameters affect the ability of these computational models to predict brain activation in a data-driven set of 500 selected voxels. The findings suggest that these computational models may contain distinct types of semantic information that relate to different brain areas in different ways. On the basis of these findings, in a second study we conducted an additional exploratory analysis of theoretically motivated brain regions in the language network. We demonstrated that data-driven computational models can be successfully integrated into theoretical frameworks to inform and test theories of semantic representation and processing. The findings from our work are discussed in light of future directions for neuroimaging and computational research.

Keywords: LSA · HAL · Semantic space models · Coarse semantic coding · fMRI

Latent semantic analysis (LSA; Landauer & Dumais, 1997) and the hyperspace analogue to language (HAL; Lund & Burgess, 1996) are among the most influential computational models of word meaning. LSA and HAL, among other so-called "semantic space models" or "distributional semantic models," use word co-occurrence frequencies as the basic building blocks of word meaning (see Jones, Willits, & Dennis, 2015, for a recent review). In these models, the co-occurrence frequencies of a word with all the documents in which it appears (as in LSA) or with all the other words alongside which it occurs (as in HAL) are used to build the vector representation for that word, typically on the basis of a very large-scale text corpus. The resulting representation of any target word is a high-dimensional vector, with each dimension denoting either a word (word-to-word matrix) or a document (word-to-document matrix). The raw vectors may consist of thousands or tens of thousands of dimensions and are usually very sparse, so dimension-reduction methods are often used to compress them into a smaller number of dimensions.

These standard methods used by LSA and HAL have since been further developed and expanded. For example, probabilistic LSA (Hofmann, 2001) and its fully Bayesian extension, the Topic model (Griffiths, Steyvers, & Tenenbaum, 2007), can identify lexemes with multiple senses (Tomar et al., 2013) and generate semantic representations as probability distributions rather than as points in a high-dimensional space. Positive pointwise mutual information (PPMI) has been used in place of raw co-occurrence frequencies (Bullinaria & Levy, 2007). Zhao, Li, and Kohonen (2011) integrated these models into a self-organizing map framework, and Fyshe, Talukdar, Murphy, and Mitchell (2013) discussed how different types of constraints on what counts as a co-occurrence qualitatively affect the semantic information these models capture.
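To make the pipeline described above concrete, the following is a minimal sketch of a HAL-style word-by-word semantic space: co-occurrence counts within a sliding window, PPMI weighting in place of raw frequencies, and an LSA-style dimension reduction via truncated SVD. The toy corpus, window size, and number of retained dimensions are illustrative assumptions, not the settings of any model evaluated in the studies reported here.

```python
# Sketch of a HAL-style co-occurrence space with PPMI weighting and
# SVD dimension reduction. Corpus, window, and k are illustrative only.
import numpy as np

corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
index = {w: i for i, w in enumerate(vocab)}
window = 2  # assumed co-occurrence window size

# Count how often each word pair co-occurs within the window.
counts = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            counts[index[w], index[corpus[j]]] += 1

# Positive pointwise mutual information: log of observed vs. expected
# co-occurrence, with non-positive values clipped to zero
# (cf. Bullinaria & Levy, 2007).
total = counts.sum()
row = counts.sum(axis=1, keepdims=True)
col = counts.sum(axis=0, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((counts * total) / (row * col))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# LSA-style truncated SVD: keep the top k dimensions as dense vectors.
k = 3  # illustrative; real models typically keep a few hundred dimensions
U, S, _ = np.linalg.svd(ppmi)
vectors = U[:, :k] * S[:k]

# Each row of `vectors` is a dense semantic vector for one vocabulary word.
print({w: np.round(vectors[index[w]], 2) for w in vocab})
```

In a full-scale model, the same steps would be applied to a corpus of millions of tokens, and the resulting word vectors could then be compared (e.g., by cosine similarity) or used as predictors of brain activation.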
Evaluation of semantic space models

Computational models...