Jacob Bremerman scite author profile

Jacob Bremerman

5 Publications

64 Citation Statements Received

94 Citation Statements Given

How they've been cited

117

How they cite others

100

Affiliations

University of Maryland, College Park

Publications

The Multilingual TEDx Corpus for Speech Recognition and Translation

Salesky

Wiesner

Bremerman

et al. 2021

We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages. The corpus is a collection of audio recordings from TEDx talks in 8 source languages. We segment transcripts into sentences and align them to the source-language audio and target-language translations. The corpus is released along with open-sourced code enabling extension to new talks and languages as they become available. Our corpus creation methodology can be applied to more languages than previous work, and creates multi-way parallel evaluation sets. We provide baselines in multiple ASR and ST settings, including multilingual models to improve translation performance for low-resource language pairs.

Findings of the IWSLT 2021 Evaluation Campaign

Anastasopoulos

Bojar

Bremerman

et al. 2021

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured this year four shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation. A total of 22 teams participated in at least one of the tasks. This paper describes each shared task, data and evaluation metrics, and reports results of the received submissions.

The Multilingual TEDx Corpus for Speech Recognition and Translation

Salesky¹,

Wiesner²,

Bremerman³

et al. 2021

Preprint

The JHU Submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education

Khayrallah

Bremerman

McCarthy

et al. 2020

This paper presents the Johns Hopkins University submission to the 2020 Duolingo Shared Task on Simultaneous Translation and Paraphrase for Language Education (STAPLE). We participated in all five language tasks, placing first in each. Our approach involved a language-agnostic pipeline of three components: (1) building strong machine translation systems on general-domain data, (2) finetuning on Duolingo-provided data, and (3) generating n-best lists which are then filtered with various score-based techniques. In addition to the language-agnostic pipeline, we attempted a number of linguistically-motivated approaches, with, unfortunately, little success. We also find that improving BLEU performance of the beam-search generated translation does not necessarily improve on the task metric, weighted macro F1 of an n-best list.

Task Description

Data

We use data provided by the STAPLE shared task (Mayhew et al., 2020). This data consists of a single English prompt sentence or phrase paired with multiple translations in the target lan-

Machine Translation Robustness to Natural Asemantic Variation

Bremerman¹,

Ren²,

May³

2022

Preprint

