Eliminating Ditransitives

Kornai, András

doi:10.1007/978-3-642-32024-8_16

Cited by 7 publications

(10 citation statements)

References 21 publications

“…The 4lang theory of semantics was introduced and motivated in Kornai (2010) and Kornai (2012). The name refers to the initial concept dictionary, which had bindings in four languages, representative samples of the major language families spoken in Europe; Germanic (English), Slavic (Polish), Romance (Latin), and Finno-Ugric (Hungarian).…”

Section: Lang (mentioning)

confidence: 99%

Measuring Semantic Similarity of Words Using Concept Networks

Recski, Iklodi, Pajkossy et al. 2016

Proceedings of the 1st Workshop on Representation Learning for NLP

We present a state-of-the-art algorithm for measuring the semantic similarity of word pairs using novel combinations of word embeddings, WordNet, and the concept dictionary 4lang. We evaluate our system on the SimLex-999 benchmark data. Our top score of 0.76 is higher than any published system that we are aware of, well beyond the average inter-annotator agreement of 0.67, and close to the 0.78 average correlation between a human rater and the average of all other ratings, suggesting that our system has achieved near-human performance on this benchmark.

Introduction

We present a hybrid system for measuring the semantic similarity of word pairs. The system relies on standard word embeddings, the WordNet database, and features derived from the 4lang concept dictionary, a set of concept graphs built from entries in monolingual dictionaries of English. 4lang-based features improve the performance of systems using only word embeddings and/or WordNet; our top configurations achieve state-of-the-art results on the SimLex-999 data, which has recently become a popular benchmark of word similarity metrics.

In Section 1 we summarize earlier work on measuring word similarity and review the latest results achieved on the SimLex-999 data. Section 2 describes our experimental setup; Sections 2.1 and 2.2 document the features obtained using word embeddings and WordNet. In Section 3 we briefly introduce the 4lang resources and the formalism they use for encoding the meaning of words as directed graphs of concepts, then document our efforts to develop novel 4lang-based similarity features. Besides improving the performance of existing systems for measuring word similarity, the goal of the present project is to examine the potential of 4lang representations in representing non-trivial lexical relationships that are beyond the scope of word embeddings and standard linguistic ontologies.

Section 4 presents our results and provides a rough error analysis. Section 5 offers some conclusions and plans for future work. All software presented in this paper is available for download under an MIT license at

Section: Lang (mentioning)

confidence: 99%

“…Edges are of three types: 0, corresponding both to attribution and IS A relations; 1, corresponding to grammatical subjects; and 2, corresponding to grammatical objects. Indirect objects are handled by the decomposition methods pioneered in generative semantics, without recourse to a '3' link type (Kornai, 2012).…”

Section: Lang (mentioning)

confidence: 99%

Measuring Semantic Similarity of Words Using Concept Networks

Recski, Iklodi, Pajkossy et al. 2016

Proceedings of the 1st Workshop on Representation Learning for NLP
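
The edge typing described in the statement above can be illustrated with a small sketch. It is not taken from the 4lang software: the `ConceptGraph` class, the `decompose_give` helper, the direction of the edges (predicate to argument), and the particular decomposition of "give" as CAUSE plus HAVE are all assumptions made for illustration, in the spirit of the generative-semantics decompositions the statement mentions.

```python
from dataclasses import dataclass, field

# Edge labels as described in the quoted statement.
ATTRIBUTION_OR_ISA = 0  # attribution and IS_A relations
SUBJECT = 1             # grammatical subject
OBJECT = 2              # grammatical object
ALLOWED_LABELS = {ATTRIBUTION_OR_ISA, SUBJECT, OBJECT}


@dataclass
class ConceptGraph:
    """Hypothetical container: a set of (source, label, target) edges."""
    edges: set = field(default_factory=set)

    def add_edge(self, source, label, target):
        # There is deliberately no '3' label for indirect objects.
        if label not in ALLOWED_LABELS:
            raise ValueError(f"unsupported edge label: {label}")
        self.edges.add((source, label, target))


def decompose_give(giver, gift, recipient):
    """Assumed decomposition: giver CAUSE (recipient HAVE gift).

    The recipient, an indirect object on the surface, becomes the
    subject of the embedded HAVE predicate, so only labels 1 and 2
    are needed.
    """
    graph = ConceptGraph()
    graph.add_edge("CAUSE", SUBJECT, giver)
    graph.add_edge("CAUSE", OBJECT, "HAVE")
    graph.add_edge("HAVE", SUBJECT, recipient)
    graph.add_edge("HAVE", OBJECT, gift)
    return graph
```

Calling `decompose_give("John", "book", "Mary")` yields four edges, none of which needs a third argument slot; trying to add an edge labeled 3 raises a `ValueError` in this sketch.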

Section: Theoretical Results (mentioning)

confidence: 92%

Proceedings of the 16th Meeting on the Mathematics of Language

Groote, Drewes, Penn 2019

“…by systematic comparison of the learned p_1 and p_2 values with the observable proportion of intransitive and transitive verbs and relational nouns. Ditransitives are rare (in fact they usually make up less than 2% of the verbs) and we think these can be eliminated entirely (Kornai, 2012) without loss of generality. The same kind of analysis could be attempted for other grammatical formalisms like type-logical grammars, which make tracking the open arguments an even more attractive proposition, but unfortunately these lack large parsed corpora.…”

Section: Discussion (mentioning)

confidence: 94%

Sentence Length

Borbély, Kornai 2019

Proceedings of the 16th Meeting on the Mathematics of Language

Self Cite

The distribution of sentence length in ordinary language is not well captured by the existing models. Here we survey previous models of sentence length and present our random walk model that offers both a better fit with the data and a better understanding of the distribution. We develop a generalization of KL divergence, discuss measuring the noise inherent in a corpus, and present a hyperparameter-free Bayesian model comparison method that has strong conceptual ties to Minimal Description Length modeling. The models we obtain require only a few dozen bits, orders of magnitude less than the naive nonparametric MDL models would.
