2021
DOI: 10.48550/arxiv.2110.02887
Preprint

Using Optimal Transport as Alignment Objective for fine-tuning Multilingual Contextualized Embeddings

Abstract: Recent studies have proposed different methods to improve multilingual word representations in contextualized settings including techniques that align between source and target embedding spaces. For contextualized embeddings, alignment becomes more complex as we additionally take context into consideration. In this work, we propose using Optimal Transport (OT) as an alignment objective during fine-tuning to further improve multilingual contextualized representations for downstream cross-lingual transfer. This …

Citations: cited by 1 publication (2 citation statements)
References: 22 publications (32 reference statements)
“…For tasks involving cross-lingual settings, Nguyen and Luu [29] employed OT distance as a part of the loss function in a knowledge distillation framework for improving the cross-lingual summarization. Alqahtani et al [1] incorporated OT as an alignment objective to improve the multilingual word representations. In this work, we explore transferring the retrieval knowledge in a cross-lingual setting via OT.…”
Section: Optimal Transport (mentioning)
confidence: 99%
“…Therefore, we approximate the calculation of 𝐷(q, 𝑞) as an optimal transport problem. First, we assign equal mass to the tokens in q and 𝑞 by defining a uniform source probability distribution, 𝜇_𝑠, on q and a uniform target probability distribution, 𝜇_𝑑, on 𝑞: 𝜇_𝑠(𝑖) = 1/𝐿 and 𝜇_𝑑(𝑗) = 1/𝐿, where 1 ≤ 𝑖, 𝑗 ≤ 𝐿. The set of transportation plans between these two distributions is then the set of doubly stochastic matrices P defined as…”
Section: Optimal Transport Knowledge Distillation (mentioning)
confidence: 99%
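
The setup described in the citation statement above (uniform mass 1/L on each of the L tokens on both sides, and a transport plan between the two distributions) can be illustrated with a small sketch. This is not the cited authors' implementation: the cosine-distance cost, the entropic (Sinkhorn) solver, the regularization value, the random token embeddings, and all function names are assumptions chosen for illustration. The variable names `mu_s` and `mu_d` simply mirror the symbols in the quoted text.

```python
import numpy as np

def uniform_marginals(L):
    """Equal mass 1/L on each of the L tokens, for both sides."""
    mu_s = np.full(L, 1.0 / L)
    mu_d = np.full(L, 1.0 / L)
    return mu_s, mu_d

def cosine_cost(X, Y):
    """Pairwise cosine distance between rows of X and Y (assumed cost)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return 1.0 - Xn @ Yn.T

def sinkhorn_plan(C, mu_s, mu_d, reg=0.5, n_iters=500):
    """Entropy-regularized OT plan via Sinkhorn iterations (assumed solver)."""
    K = np.exp(-C / reg)              # Gibbs kernel
    u = np.ones_like(mu_s)
    for _ in range(n_iters):
        v = mu_d / (K.T @ u)          # rescale to match column marginals
        u = mu_s / (K @ v)            # rescale to match row marginals
    return u[:, None] * K * v[None, :]

# Hypothetical token embeddings for two sequences of equal length L.
rng = np.random.default_rng(0)
L, dim = 5, 8
X = rng.normal(size=(L, dim))
Y = rng.normal(size=(L, dim))

mu_s, mu_d = uniform_marginals(L)
C = cosine_cost(X, Y)
P = sinkhorn_plan(C, mu_s, mu_d)

# Transport cost under the (regularized) plan.
ot_distance = float(np.sum(P * C))
```

Because every token carries mass 1/L, each row and each column of `P` sums to 1/L, so `L * P` is a doubly stochastic matrix, which matches the characterization in the quoted statement up to that scaling. The entropic regularization makes the plan dense; an exact linear-programming solver would be a different design choice.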