Simultaneous translation, which translates sentences before they are finished, is useful in many scenarios but is notoriously difficult due to word-order differences. While the conventional seq-to-seq framework is only suitable for full-sentence translation, we propose a novel prefix-to-prefix framework for simultaneous translation that implicitly learns to anticipate in a single translation model. Within this framework, we present a very simple yet surprisingly effective "wait-k" policy trained to generate the target sentence concurrently with the source sentence, but always k words behind. Experiments show our strategy achieves low latency and reasonable quality (compared to full-sentence translation) on 4 directions: zh↔en and de↔en.

* M.M. and L.H. contributed equally; L.H. conceived the main ideas (prefix-to-prefix and wait-k) and directed the project, while M.M. led the implementations on RNN and Transformer. See example videos, media reports, code, and data at https://simultrans-demo.github.io/.
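To make the wait-k schedule concrete, here is a minimal Python sketch of the policy's read/write loop (not the paper's implementation). It assumes two hypothetical callables: read_source_word(), which yields streaming source words and returns None when the source is exhausted, and translate_next_word(source_prefix, target_prefix), which stands in for the underlying prefix-to-prefix model. The decoder emits each target word only after it has read k more source words than it has written.

    # Minimal sketch of the wait-k decoding schedule (illustrative, not the authors' code).
    # `read_source_word` and `translate_next_word` are hypothetical stand-ins for the
    # streaming input and the prefix-to-prefix NMT model, respectively.

    def wait_k_decode(k, read_source_word, translate_next_word):
        """Emit target words while staying k source words behind the input stream."""
        source_prefix, target_prefix = [], []
        finished_reading = False
        while True:
            # READ: consume source words until we are k words ahead of the target
            # (or the source sentence has ended).
            while not finished_reading and len(source_prefix) < len(target_prefix) + k:
                word = read_source_word()      # returns None once the source is done
                if word is None:
                    finished_reading = True
                else:
                    source_prefix.append(word)
            # WRITE: generate one target word conditioned on the current source prefix.
            next_word = translate_next_word(source_prefix, target_prefix)
            if next_word == "</s>":            # model signals end of target sentence
                break
            target_prefix.append(next_word)
        return target_prefix

Under this schedule the first target word appears after k source words have been read; once the source ends, the remaining target words are generated as in conventional full-sentence decoding.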
[Figure: wait-k example. Source side → "Bùshí (Bush) zǒngtǒng (President) zài (at) Mòsīkē (Moscow) yǔ (with) Pǔjīng (Putin) huìwù (meet)"; Target side → "President Bush met with Putin in Moscow"; read, write, and prediction steps are annotated in the original figure.]

2 Preliminaries: Full-Sentence NMT

We first briefly review standard (full-sentence) neural translation to set up the notations. Regardless of the particular design of different seq-to-seq models, the encoder always takes