This paper introduces a meaning representation for spoken language understanding. The Alexa meaning representation language (AMRL), unlike previous approaches that factor spoken utterances into domains, provides a common representation for how people communicate in spoken language. AMRL is a rooted graph that links to a large-scale ontology and supports cross-domain queries, fine-grained types, complex utterances, and composition. A spoken-language dataset containing ∼20k examples across eight domains has been collected for Alexa. A version of this meaning representation was released to developers at a trade show in 2016.
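As a rough illustration of the kind of structure this abstract describes, the following Python sketch models an utterance as a rooted graph whose nodes carry ontology types and whose edges carry relations. The class names, ontology types, relations, and the example utterance are hypothetical assumptions for illustration, not the actual AMRL schema.

```python
# Hypothetical sketch of a rooted-graph meaning representation in the spirit of AMRL.
# Class names, ontology types, and relations are illustrative assumptions only.
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    onto_type: str                              # type drawn from a large-scale ontology
    properties: dict = field(default_factory=dict)

@dataclass
class MeaningGraph:
    root: str                                   # id of the root node
    nodes: dict = field(default_factory=dict)   # node_id -> Node
    edges: list = field(default_factory=list)   # (source_id, relation, target_id)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, src: str, relation: str, dst: str) -> None:
        self.edges.append((src, relation, dst))

# "play jazz music in the kitchen" as a single graph spanning music and
# smart-home concepts, rather than two separate domain-specific frames.
g = MeaningGraph(root="a0")
g.add_node(Node("a0", "PlaybackAction"))
g.add_node(Node("m0", "MusicRecording", {"genre": "jazz"}))
g.add_node(Node("d0", "Device", {"location": "kitchen"}))
g.add_edge("a0", "object", "m0")
g.add_edge("a0", "device", "d0")
```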
Neural language models (NLMs) have been able to improve machine translation (MT) thanks to their ability to generalize well to long contexts. Despite recent successes of deep neural networks in speech and vision, the general practice in MT is to incorporate NLMs with only one or two hidden layers, and there have been no clear results on whether having more layers helps. In this paper, we demonstrate that deep NLMs with three or four layers outperform those with fewer layers in terms of both perplexity and translation quality. We combine various techniques to successfully train deep NLMs that jointly condition on both the source and target contexts. When reranking n-best lists of a strong web-forum baseline, our deep models yield an average boost of 0.5 TER / 0.5 BLEU points compared to using a shallow NLM. Additionally, we adapt our models to a new SMS-chat domain and obtain a similar gain of 1.0 TER / 0.5 BLEU points.
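A minimal sketch of the reranking setup described above, assuming a hypothetical `deep_nlm_logprob(source, hypothesis)` scorer for a deep NLM that jointly conditions on source and target context; the interpolation weight and score format are illustrative, not the paper's configuration.

```python
# Minimal n-best reranking sketch: combine the baseline decoder score with a
# deep neural LM score and re-sort. Scorer interface and weight are assumptions.
def rerank_nbest(source, nbest, deep_nlm_logprob, weight=0.5):
    """nbest: list of (hypothesis, baseline_score) pairs."""
    rescored = []
    for hyp, base_score in nbest:
        combined = base_score + weight * deep_nlm_logprob(source, hyp)
        rescored.append((combined, hyp))
    # Best hypothesis first under the combined score.
    return [hyp for _, hyp in sorted(rescored, key=lambda x: x[0], reverse=True)]
```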
This article presents several techniques for integrating information from a rule-based machine translation (RBMT) system into a statistical machine translation (SMT) framework. These techniques are grouped into three parts that correspond to the type of information integrated: the morphological, lexical, and system levels. The first part presents techniques that use information from a rule-based morphological tagger to perform morpheme splitting of the Arabic source text; we also compare against the results of using a statistical morphological tagger. In the second part, we present two ways of using Arabic diacritics to improve SMT results, both based on binary decision trees. The third part presents a system combination method that combines the outputs of the RBMT and SMT systems, leveraging the strengths of each. This article shows how language-specific information obtained through a deterministic rule-based process can be used to improve SMT, which is mostly language-independent.
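For concreteness, here is a small sketch of the morpheme-splitting step, assuming the tagger supplies a per-token segmentation. The `prefix+ stem +suffix` notation and the Arabic example (in Buckwalter-style transliteration) are illustrative assumptions, not the article's actual tool output.

```python
# Toy sketch of applying a morphological tagger's segmentation to the Arabic
# source before SMT. The segmentation format and example are assumptions.
def split_morphemes(tokens, analyses):
    """tokens: list of surface tokens.
    analyses: token -> list of morphemes, e.g. 'wktAbhm' -> ['w+', 'ktAb', '+hm']."""
    segmented = []
    for tok in tokens:
        segmented.extend(analyses.get(tok, [tok]))   # fall back to the surface form
    return segmented

print(split_morphemes(["wktAbhm"], {"wktAbhm": ["w+", "ktAb", "+hm"]}))
# -> ['w+', 'ktAb', '+hm']  ("and their book": conjunction + stem + clitic)
```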
We contribute a faster decoding algorithm for phrase-based machine translation. Translation hypotheses keep track of state, such as context for the language model and coverage of words in the source sentence. Most features depend upon only part of the state, but traditional algorithms, including cube pruning, handle state atomically. For example, cube pruning will repeatedly query the language model with hypotheses that differ only in source coverage, despite the fact that source coverage is irrelevant to the language model. Our key contribution avoids this behavior by placing hypotheses into equivalence classes, masking the parts of state that matter least to the score. Moreover, we exploit shared words in hypotheses to iteratively refine language model scores rather than handling language model state atomically. Since our algorithm and cube pruning are both approximate, the improvement can be used to increase speed or accuracy. When tuned to attain the same accuracy, our algorithm is 4.0-7.7 times as fast as the Moses decoder with cube pruning.
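The following toy sketch illustrates the equivalence-class idea in isolation: hypotheses are grouped by the projection of state that a feature actually reads, so the feature is evaluated once per class rather than once per hypothesis. The data layout, interfaces, and stand-in language model below are assumptions for illustration, not the decoder's implementation.

```python
# Toy sketch: score a feature once per equivalence class of hypotheses,
# where the class is defined by the part of state the feature depends on.
from collections import defaultdict

def score_feature_by_class(hypotheses, relevant_state, feature_score):
    """hypotheses: list of hypothesis objects (any type).
    relevant_state: hyp -> hashable projection of the state the feature reads.
    feature_score: projected state -> float."""
    classes = defaultdict(list)
    for hyp in hypotheses:
        classes[relevant_state(hyp)].append(hyp)

    scores = {}
    for state, members in classes.items():
        s = feature_score(state)          # evaluated once per equivalence class
        for hyp in members:
            scores[id(hyp)] = s
    return scores

# Example: hypotheses that differ only in source coverage share an LM context,
# so the (expensive) LM feature is queried once for both.
hyps = [{"lm_context": ("the", "cat"), "coverage": 0b0011},
        {"lm_context": ("the", "cat"), "coverage": 0b0101}]
lm_scores = score_feature_by_class(hyps, lambda h: h["lm_context"],
                                   lambda ctx: -1.0 * len(ctx))  # stand-in LM score
```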
The current state-of-the-art task-oriented semantic parsing models use BERT or RoBERTa as pretrained encoders; these models have huge memory footprints. This poses a challenge to their deployment for voice assistants such as Amazon Alexa and Google Assistant on edge devices with limited memory budgets. We propose to learn compositional code embeddings to greatly reduce the sizes of BERT-base and RoBERTa-base. We also apply the technique to DistilBERT, ALBERT-base, and ALBERT-large, three already-compressed BERT variants that attain similar state-of-the-art performance on semantic parsing with much smaller model sizes. We observe 95.15%–98.46% embedding compression rates and 20.47%–34.22% encoder compression rates, while preserving >97.5% of semantic parsing performance. We provide the recipe for training and analyze the trade-off between code embedding sizes and downstream performance.
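As a rough sketch of the compositional code embedding idea, the snippet below stores each vocabulary item as a handful of discrete codes and reconstructs its embedding as a sum of code-indexed vectors from small shared codebooks. The sizes and the random initialization are illustrative assumptions, not the paper's trained values or exact scheme.

```python
# Minimal numpy sketch of compositional code embeddings: each token is stored
# as M discrete codes; its embedding is the sum of the selected codebook vectors.
import numpy as np

M, K, d, vocab = 8, 32, 768, 30000       # M codebooks, K codes each, embedding dim d
codebooks = np.random.randn(M, K, d)     # learned in practice, random here
codes = np.random.randint(0, K, size=(vocab, M))   # M discrete codes per token

def embed(token_id):
    # Sum the code-indexed vectors across the M codebooks -> (d,) vector.
    return codebooks[np.arange(M), codes[token_id]].sum(axis=0)

vec = embed(42)   # reconstructed 768-dim embedding
# Storage: vocab * M small integers plus M * K * d floats, versus
# vocab * d floats for a full embedding table.
```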