For machine translation to tackle discourse phenomena, models must have access to extra-sentential linguistic context. There has been recent interest in modelling context in neural machine translation (NMT), but models have been principally evaluated with standard automatic metrics, which are poorly adapted to evaluating discourse phenomena. In this article, we present hand-crafted discourse test sets designed to test the models' ability to exploit previous source and target sentences. We investigate the performance of recently proposed multi-encoder NMT models trained on subtitles for English to French. We also explore a novel way of exploiting context from the previous sentence. Despite gains in BLEU, multi-encoder models give limited improvement in the handling of discourse phenomena: 50% accuracy on our coreference test set and 53.5% for coherence/cohesion (compared to a non-contextual baseline of 50%). A simple strategy of decoding the concatenation of the previous and current sentence leads to good performance, and our novel strategy of multi-encoding and decoding of two sentences leads to the best performance (72.5% for coreference and 57% for coherence/cohesion), highlighting the importance of target-side context.
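A minimal sketch of the concatenation strategy mentioned in the abstract: each training pair combines the previous and current sentence on both the source and target side, so the decoder produces the previous and current target sentence together. The separator token "<CONCAT>" and the function name are illustrative assumptions, not the authors' exact pipeline.

```python
# Sketch of building "2-to-2" training examples: (previous + current) source
# paired with (previous + current) target. At inference time, only the text
# after the separator is kept as the translation of the current sentence.
from typing import List, Tuple

SEP = "<CONCAT>"  # assumed separator token marking the sentence boundary


def build_2to2_examples(
    src_sents: List[str], tgt_sents: List[str]
) -> List[Tuple[str, str]]:
    examples = []
    for i, (src, tgt) in enumerate(zip(src_sents, tgt_sents)):
        prev_src = src_sents[i - 1] if i > 0 else ""
        prev_tgt = tgt_sents[i - 1] if i > 0 else ""
        examples.append(
            (f"{prev_src} {SEP} {src}".strip(), f"{prev_tgt} {SEP} {tgt}".strip())
        )
    return examples
```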
Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language model training. Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping general natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts written in varying natural language. These prompted datasets allow for benchmarking the ability of a model to perform completely unseen tasks specified in natural language. We fine-tune a pretrained encoder-decoder model on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several standard datasets, often outperforming models up to 16× its size. Further, our approach attains strong performance on a subset of tasks from the BIG-Bench benchmark, outperforming models up to 6× its size. All prompts and trained models are available at github.com/bigscience-workshop/promptsource/ and huggingface.co/bigscience/T0pp.
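A minimal sketch of mapping a supervised example into a human-readable prompted form, in the spirit of the system described above. The actual promptsource library uses Jinja templates; the template strings, field names, and label scheme here are illustrative assumptions.

```python
# Convert one supervised NLI example into several differently worded prompts,
# each paired with a natural-language target answer.
example = {
    "premise": "A dog is running in the park.",
    "hypothesis": "An animal is outside.",
    "label": 0,  # 0 = entailment in this hypothetical label scheme
}

templates = [
    'Suppose "{premise}" Can we infer that "{hypothesis}"? Yes or no?',
    '{premise} Based on the previous passage, is it true that "{hypothesis}"?',
]

label_words = {0: "Yes", 1: "Maybe", 2: "No"}

for t in templates:
    prompt = t.format(**example)
    target = label_words[example["label"]]
    print(prompt, "->", target)
```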
The University of Edinburgh participated in the WMT19 Shared Task on News Translation in six language directions: English↔Gujarati, English↔Chinese, German→English, and English→Czech. For all translation directions, we created or used back-translations of monolingual data in the target language as additional synthetic training data. For English↔Gujarati, we also explored semi-supervised MT with cross-lingual language model pre-training and translation pivoting through Hindi. For translation to and from Chinese, we investigated character-based tokenisation vs. sub-word segmentation of Chinese text. For German→English, we studied the impact of vast amounts of back-translated training data on translation quality, gaining a few additional insights over Edunov et al. (2018). For English→Czech, we compared different pre-processing and tokenisation regimes.
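A minimal sketch contrasting the two Chinese tokenisation regimes mentioned above. Character-based tokenisation simply splits every CJK character into its own token; sub-word segmentation would instead apply a learned model (e.g. BPE via sentencepiece), which is only indicated in a comment as an assumed alternative, not the authors' exact setup.

```python
# Character-based tokenisation: split CJK chunks into single characters,
# leaving other whitespace-separated tokens (numbers, Latin words) intact.
def char_tokenise(text: str) -> list:
    tokens = []
    for chunk in text.split():
        if any("\u4e00" <= ch <= "\u9fff" for ch in chunk):
            tokens.extend(list(chunk))
        else:
            tokens.append(chunk)
    return tokens


print(char_tokenise("机器翻译 is fun"))
# -> ['机', '器', '翻', '译', 'is', 'fun']

# Sub-word segmentation (assumed alternative, requires a trained BPE model):
# import sentencepiece as spm
# sp = spm.SentencePieceProcessor(model_file="zh_bpe.model")
# sp.encode("机器翻译", out_type=str)
```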