Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
DOI: 10.18653/v1/2021.acl-short.69
When is Char Better Than Subword: A Systematic Study of Segmentation Algorithms for Neural Machine Translation

Abstract: Subword segmentation algorithms have been the de facto choice when building neural machine translation systems. However, most of them need to learn a segmentation model based on heuristics, which may produce suboptimal segmentations. This can be problematic when the target language has rich morphology or when there is not enough data for learning compact composition rules. Translating at the fully character level has the potential to alleviate the issue, but the empirical performance of char…
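As a rough intuition for the contrast the abstract draws, below is a minimal sketch in Python of a BPE-style learner (greedy pair merging in the spirit of Sennrich et al., 2016) beside plain character segmentation. It is an illustration only, not the paper's implementation; the toy corpus, the merge budget, and the function names (char_segment, learn_bpe, bpe_segment) are hypothetical.

from collections import Counter

def char_segment(word):
    # Fully character-level segmentation: every symbol is its own token.
    return list(word)

def merge_pair(symbols, pair):
    # Replace every adjacent occurrence of `pair` with the fused symbol.
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def learn_bpe(word_freqs, num_merges):
    # Toy BPE trainer: greedily merge the most frequent adjacent symbol
    # pair, `num_merges` times, over a frequency-weighted word list.
    vocab = {tuple(w): f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:  # every word collapsed to a single symbol
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = {tuple(merge_pair(list(s), best)): f
                 for s, f in vocab.items()}
    return merges

def bpe_segment(word, merges):
    # Apply the learned merges in order to segment an unseen word.
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols

if __name__ == "__main__":
    corpus = {"lower": 5, "lowest": 2, "newer": 6, "wider": 3}
    merges = learn_bpe(corpus, num_merges=8)
    print(char_segment("lowest"))          # ['l', 'o', 'w', 'e', 's', 't']
    print(bpe_segment("lowest", merges))   # coarser, merge-induced units

With few merges or little training data, merges learned this greedy way cover the vocabulary poorly, which is exactly the regime where the paper asks whether pure character segmentation is the better choice.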

Cited by 6 publications (9 citation statements) · References 18 publications
“…The results are presented in Table 1. Except for translation into Arabic (which is consistent with the findings of Levy, 2021a and Li et al., 2021), where character methods outperform BPEs, subword methods are always better than characters.…”
Section: Translation Quality (supporting, confidence: 84%)
“…Therefore, we assume that, if character-level models were a fully-fledged alternative to subword models, at least some systems submitted to the shared tasks would use character-level models. Li et al. (2021) evaluated domain robustness by training models on small domain-specific datasets and evaluating them on unrelated domains, claiming the superiority of character-level models in this setup. We argue that this is a very unnatural setup and instead evaluate domain robustness by evaluating general models on domain-specific test sets.…”
Section: WMT Submissions (mentioning, confidence: 99%)
“…For NMT, several papers have claimed parity of character-based methods with subword models, highlighting advantageous features of such systems. Very recent examples include Gao et al. (2020); Banar et al. (2020); Li et al. (2021). Despite this, character-level methods are rarely used as strong baselines in research papers and shared task submissions, suggesting that character-level models might have drawbacks that are not sufficiently addressed in the literature.…”
Section: Introduction (mentioning, confidence: 99%)