Guy Emerson scite author profile

Guy Emerson

5 Publications

60 Citation Statements Received

208 Citation Statements Given

Affiliations

University of Cambridge, Saarland University

Publications

Words are Vectors, Dependencies are Matrices: Learning Word Embeddings from Dependency Graphs

Czarnowska, Emerson, Copestake

2019

Distributional Semantic Models (DSMs) construct vector representations of word meanings based on their contexts. Typically, the contexts of a word are defined as its closest neighbours, but they can also be retrieved from its syntactic dependency relations. In this work, we propose a new dependency-based DSM. The novelty of our model lies in associating an independent meaning representation, a matrix, with each dependency label. This allows it to capture specifics of the relations between words and contexts, leading to good performance on both intrinsic and extrinsic evaluation tasks. In addition, our model has an inherent ability to represent dependency chains as products of matrices, which provides a straightforward way of handling further contexts of a word.
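The idea of representing a dependency chain as a product of per-label matrices can be illustrated with a minimal sketch. This is not the authors' implementation: the label names (`nsubj`, `amod`), the 2x2 matrix values, and the helper names `matmul` and `chain_matrix` are all hypothetical, chosen only to show the composition step described in the abstract.

```python
from functools import reduce

Matrix = list[list[float]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two dense matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def chain_matrix(labels: list[str], label_matrices: dict[str, Matrix]) -> Matrix:
    """Compose a non-empty chain of dependency labels into one matrix.

    Each label has its own matrix; the chain is the left-to-right product.
    """
    return reduce(matmul, (label_matrices[label] for label in labels))

# Hypothetical toy matrices, one per dependency label (assumed values).
label_matrices = {
    "nsubj": [[1.0, 2.0], [0.0, 1.0]],
    "amod": [[0.0, 1.0], [1.0, 0.0]],
}

# A two-step chain (nsubj followed by amod) collapses to a single matrix.
chain = chain_matrix(["nsubj", "amod"], label_matrices)
```

Because the composed chain is itself a matrix of the same shape, a context reached through several dependency steps can be handled in the same way as a direct one.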

What are the Goals of Distributional Semantics?

Emerson

2020

Distributional semantic models have become a mainstay in NLP, providing useful features for downstream tasks. However, assessing long-term progress requires explicit long-term goals. In this paper, I take a broad linguistic perspective, looking at how well current models can deal with various semantic challenges. Given stark differences between models proposed in different subfields, a broad perspective is needed to see how we could integrate them. I conclude that, while linguistic insights can guide the design of model architectures, future progress will require balancing the often conflicting demands of linguistic expressiveness and computational tractability.

SentiMerge: Combining Sentiment Lexicons in a Bayesian Framework

Emerson, Declerck

2014

Many approaches to sentiment analysis rely on a lexicon that labels words with a prior polarity. This is particularly true for languages other than English, where labelled training data is not easily available. Efforts to produce such lexicons already exist, and to avoid duplicated effort, a principled way to combine multiple resources is required. In this paper, we introduce a Bayesian probabilistic model, which can simultaneously combine polarity scores from several data sources and estimate the quality of each source. We apply this algorithm to a set of four German sentiment lexicons, to produce the SentiMerge lexicon, which we make publicly available. In a simple classification task, we show that this lexicon outperforms each of the underlying resources, as well as a majority vote model.

Leveraging Sentence Similarity in Natural Language Generation: Improving Beam Search using Range Voting

Borgeaud, Emerson

2020

We propose a method for natural language generation, choosing the most representative output rather than the most likely output. By viewing the language generation process from the voting theory perspective, we define representativeness using range voting and a similarity measure. The proposed method can be applied when generating from any probabilistic language model, including n-gram models and neural network models. We evaluate different similarity measures on an image captioning task and a machine translation task, and show that our method generates longer and more diverse sentences, providing a solution to the common problem of short outputs being preferred over longer and more informative ones. The generated sentences obtain higher BLEU scores, particularly when the beam size is large. We also perform a human evaluation on both tasks and find that the outputs generated using our method are rated higher.
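Choosing the most representative output by range voting can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: it assumes each beam candidate acts as both voter and candidate, that votes are weighted by the candidate's probability, and it uses a word-overlap (Jaccard) score as a stand-in similarity measure. The names `jaccard` and `range_vote` and the example sentences and probabilities are hypothetical.

```python
from typing import Callable, Sequence

def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two sentences (an assumed stand-in similarity)."""
    words_a, words_b = set(a.split()), set(b.split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)

def range_vote(
    candidates: Sequence[str],
    probs: Sequence[float],
    similarity: Callable[[str, str], float] = jaccard,
) -> str:
    """Return the candidate with the highest probability-weighted total score.

    Every candidate votes for every candidate (including itself) with a score
    equal to their similarity, weighted by the voter's probability.
    """
    totals = [
        sum(p * similarity(voter, cand) for voter, p in zip(candidates, probs))
        for cand in candidates
    ]
    best = max(range(len(candidates)), key=totals.__getitem__)
    return candidates[best]

# Hypothetical beam output: the short sentence is the single most likely one.
beam = ["a cat", "a cat sits on the mat", "a cat sits on a mat", "dogs bark"]
beam_probs = [0.4, 0.25, 0.25, 0.1]

most_likely = beam[max(range(len(beam)), key=beam_probs.__getitem__)]
most_representative = range_vote(beam, beam_probs)
```

In this toy beam the two longer, similar sentences support each other, so one of them wins the vote even though the short sentence has the highest individual probability.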

Bad Form: Comparing Context-Based and Form-Based Few-Shot Learning in Distributional Semantic Models

Hautte, Emerson, Rei

2019

Word embeddings are an essential component in a wide range of natural language processing applications. However, distributional semantic models are known to struggle when only a small number of context sentences are available. Several methods have been proposed to obtain higher-quality vectors for these words, leveraging both this context information and sometimes the word forms themselves through a hybrid approach. We show that the current tasks do not suffice to evaluate models that use word-form information, as such models can easily leverage word forms in the training data that are related to word forms in the test data. We introduce 3 new tasks, allowing for a more balanced comparison between models. Furthermore, we show that hyperparameters that have largely been ignored in previous work can consistently improve the performance of both baseline and advanced models, achieving a new state of the art on 4 out of 6 tasks.
