In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNNs). One RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of a target sequence given a source sequence. The performance of a statistical machine translation system is empirically found to improve by using the conditional probabilities of phrase pairs computed by the RNN Encoder-Decoder as an additional feature in the existing log-linear model. Qualitatively, we show that the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases.
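Below is a minimal sketch of the kind of encoder-decoder described above, written in PyTorch with GRU cells; the class name, dimensions, and training details are illustrative assumptions rather than the paper's exact implementation, and teacher-forced target shifting is omitted for brevity.

import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=128, hid_dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the source sequence into a fixed-length vector (final hidden state).
        _, context = self.encoder(self.src_emb(src))
        # Decode the target sequence conditioned on that vector.
        dec_out, _ = self.decoder(self.tgt_emb(tgt), context)
        return self.out(dec_out)  # per-step logits over the target vocabulary

# Joint training maximizes the conditional log-probability of the target given the
# source, i.e. cross-entropy over the decoder's per-step predictions.
src = torch.randint(0, 5000, (2, 7))
tgt = torch.randint(0, 5000, (2, 9))
model = EncoderDecoder(src_vocab=5000, tgt_vocab=5000)
logits = model(src, tgt)
loss = nn.functional.cross_entropy(logits.reshape(-1, 5000), tgt.reshape(-1))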
Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of the source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subword-level encoder on WMT'15 DE-EN and CS-EN, and gives comparable performance on FI-EN and RU-EN. We then demonstrate that it is possible to share a single character-level encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the character-level encoder significantly outperforms the subword-level encoder on all the language pairs. We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models trained specifically on that language pair alone, both in terms of the BLEU score and human judgment.
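The following is a hedged sketch of a character-level convolutional encoder with max-pooling over time, in PyTorch; the filter width, pooling stride, and dimensions are illustrative choices, not the configuration used in the paper.

import torch
import torch.nn as nn

class CharConvEncoder(nn.Module):
    def __init__(self, n_chars=300, emb_dim=64, n_filters=256, width=5, pool=5):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb_dim)
        # A 1-D convolution over the character sequence captures local regularities.
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=width, padding=width // 2)
        # Max-pooling with stride > 1 shortens the sequence the decoder must process.
        self.pool = nn.MaxPool1d(kernel_size=pool, stride=pool)
        self.rnn = nn.GRU(n_filters, n_filters, batch_first=True)

    def forward(self, chars):                       # chars: (batch, src_len)
        x = self.emb(chars).transpose(1, 2)         # (batch, emb_dim, src_len)
        x = torch.relu(self.conv(x))                # (batch, n_filters, src_len)
        x = self.pool(x).transpose(1, 2)            # (batch, src_len // pool, n_filters)
        return self.rnn(x)                          # shorter sequence of hidden states

enc = CharConvEncoder()
states, last = enc(torch.randint(0, 300, (2, 100)))  # 100 characters -> 20 time steps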
While one of the first steps in many NLP systems is selecting what pre-trained word embeddings to use, we argue that such a step is better left for neural networks to figure out by themselves. To that end, we introduce dynamic meta-embeddings, a simple yet effective method for the supervised learning of embedding ensembles, which leads to state-of-the-art performance within the same model class on a variety of tasks. We subsequently show how the technique can be used to shed new light on the usage of word embeddings in NLP systems.
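As a rough illustration of the idea, the sketch below projects several pre-trained embedding tables to a common size and mixes them with learned per-token attention weights; the dimensions, the gating form, and all names are assumptions for illustration, not the exact model from the paper.

import torch
import torch.nn as nn

class DynamicMetaEmbedding(nn.Module):
    def __init__(self, embedding_tables, proj_dim=256):
        super().__init__()
        # embedding_tables: list of nn.Embedding modules (e.g. GloVe, fastText),
        # typically loaded from pre-trained vectors and kept frozen.
        self.tables = nn.ModuleList(embedding_tables)
        self.proj = nn.ModuleList([nn.Linear(t.embedding_dim, proj_dim) for t in self.tables])
        self.attn = nn.Linear(proj_dim, 1)  # scalar score per embedding set, per token

    def forward(self, token_ids):                                   # (batch, seq)
        projected = torch.stack([p(t(token_ids))                    # each (batch, seq, proj_dim)
                                 for t, p in zip(self.tables, self.proj)], dim=2)
        weights = torch.softmax(self.attn(projected), dim=2)        # (batch, seq, n_sets, 1)
        return (weights * projected).sum(dim=2)                     # learned weighted mixture

glove = nn.Embedding(10000, 300)
fasttext = nn.Embedding(10000, 300)
dme = DynamicMetaEmbedding([glove, fasttext])
mixed = dme(torch.randint(0, 10000, (4, 12)))   # (4, 12, 256)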