In this paper, we show that Multilingual BERT (M-BERT), released by Devlin et al. (2019) as a single language model pre-trained from monolingual corpora in 104 languages, is surprisingly good at zero-shot cross-lingual model transfer, in which task-specific annotations in one language are used to fine-tune the model for evaluation in another language. To understand why, we present a large number of probing experiments, showing that transfer is possible even to languages in different scripts, that transfer works best between typologically similar languages, that monolingual corpora can train models for code-switching, and that the model can find translation pairs. From these results, we can conclude that M-BERT does create multilingual representations, but that these representations exhibit systematic deficiencies affecting certain language pairs.
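To make the zero-shot transfer setup concrete, the sketch below fine-tunes M-BERT on a labeled example in one language and then evaluates it on another language it never saw labels for. It assumes the Hugging Face transformers library and the public bert-base-multilingual-cased checkpoint; the task, sentences, and single training step are illustrative only, not the paper's actual experimental setup.

```python
# Minimal sketch of zero-shot cross-lingual transfer with M-BERT (illustrative setup,
# not the paper's experiments): fine-tune on one language, evaluate on another.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2
)

# 1) Fine-tune on task-specific annotations in ONE language (here, English).
english_batch = tokenizer(["the movie was great"], return_tensors="pt")
labels = torch.tensor([1])
loss = model(**english_batch, labels=labels).loss
loss.backward()  # one illustrative step; a real run uses an optimizer over many batches

# 2) Evaluate zero-shot in ANOTHER language, with no labels from it used in training.
model.eval()
with torch.no_grad():
    spanish_batch = tokenizer(["la película fue estupenda"], return_tensors="pt")
    prediction = model(**spanish_batch).logits.argmax(dim=-1)
print(prediction)
```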
This work describes the development of an embedded system, installed on automotive vehicles, that identifies the geographic location of holes in the road. An accelerometer and a GPS receiver connected to a microcontroller are used to perform this task. A computer receives the data collected by the system and stores it for later analysis of road conditions.
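As a rough illustration of the detection logic described above, the sketch below flags accelerometer spikes above an assumed vertical-shock threshold and tags each one with the most recent GPS fix. The threshold value, data format, and detect_potholes helper are hypothetical placeholders, not the firmware actually running on the microcontroller.

```python
# Minimal sketch (assumed logic): flag large vertical-acceleration spikes and
# record the GPS position at which each spike occurred.
from dataclasses import dataclass

G = 9.81                    # gravity, m/s^2
SHOCK_THRESHOLD = 2.5 * G   # assumed spike level indicating a hole in the road

@dataclass
class PotholeEvent:
    latitude: float
    longitude: float
    vertical_accel: float   # m/s^2

def detect_potholes(samples):
    """samples: iterable of (vertical_accel_mps2, latitude, longitude) tuples
    read from the accelerometer and GPS receiver."""
    events = []
    for accel, lat, lon in samples:
        if abs(accel) > SHOCK_THRESHOLD:
            # Tag the shock with the latest GPS fix so the computer that later
            # receives the log can map road condition by location.
            events.append(PotholeEvent(lat, lon, accel))
    return events

# Example: one smooth sample and one spike over a hole.
print(detect_potholes([(9.9, -23.561, -46.655), (31.0, -23.562, -46.656)]))
```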
Code switching (CS) refers to the phenomenon of interchangeably using words and phrases from different languages. CS can pose significant accuracy challenges to NLP, due to the often monolingual nature of the underlying systems. In this work, we focus on CS in the context of English/Spanish conversations for the task of speech translation (ST), generating and evaluating both transcript and translation. To evaluate model performance on this task, we create a novel ST corpus derived from existing public data sets. We explore various ST architectures across two dimensions: cascaded (transcribe then translate) vs. end-to-end (jointly transcribe and translate) and unidirectional (source → target) vs. bidirectional (source ↔ target). We show that our ST architectures, and especially our bidirectional end-to-end architecture, perform well on CS speech, even when no CS training data is used.
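To make the cascaded vs. end-to-end dimension concrete, the sketch below contrasts a pipeline that transcribes and then translates with a single model that emits both outputs jointly. All functions are hypothetical placeholders standing in for real ASR, MT, and joint speech-translation models, not the systems built in this work.

```python
# Minimal sketch of the two ST architecture styles (placeholder models only).
def asr(audio: str) -> str:
    """Placeholder speech recognizer: audio -> source-language transcript."""
    return f"<transcript of {audio}>"

def mt(text: str) -> str:
    """Placeholder text translator: source transcript -> target-language text."""
    return f"<translation of {text}>"

def cascaded_st(audio: str) -> tuple[str, str]:
    """Cascaded: transcribe first, then translate the transcript. Transcript
    errors (e.g. on code-switched speech) propagate into the translation."""
    transcript = asr(audio)
    return transcript, mt(transcript)

def end_to_end_st(audio: str) -> tuple[str, str]:
    """End-to-end: one model jointly produces transcript and translation from
    the audio; simulated here by composing the placeholders. A bidirectional
    variant would accept either English or Spanish audio with the same model."""
    return asr(audio), mt(asr(audio))

print(cascaded_st("meeting_turn_01.wav"))
print(end_to_end_st("meeting_turn_01.wav"))
```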