Multilingual BERT has been shown to generalize well in zero-shot cross-lingual settings, but this generalization has so far been measured mainly on POS tagging and NER tasks. We explore the cross-lingual transferability of multilingual BERT on the reading comprehension task. We compare different training regimes for a question-answering model in a non-English language, using both English and language-specific data. We demonstrate that the multilingual BERT-based model falls slightly behind the monolingual BERT-based model on Russian data, while achieving results comparable to the language-specific variant on Chinese. We also show that training jointly on English data and an additional 10,000 monolingual samples yields performance comparable to training on monolingual data alone.
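To make the setup concrete, below is a minimal sketch, assuming the Hugging Face transformers library and the bert-base-multilingual-cased checkpoint, of how a SQuAD-style reading-comprehension example could be encoded and used to fine-tune an mBERT question-answering model. The Russian example sentence, variable names, and span-mapping code are illustrative assumptions, not the paper's exact pipeline.

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# mBERT with a freshly initialized span-prediction (QA) head to be fine-tuned.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForQuestionAnswering.from_pretrained("bert-base-multilingual-cased")

# Hypothetical Russian example; English or Chinese SQuAD-style samples are
# encoded identically, which is what makes joint multilingual training possible.
context = "Москва — столица России."
question = "Что является столицей России?"
answer_text = "Москва"
answer_start = context.index(answer_text)

# Encode question+context as a pair, keeping character offsets for each token.
enc = tokenizer(question, context, return_offsets_mapping=True, return_tensors="pt")
offsets = enc.pop("offset_mapping")[0].tolist()
context_tokens = [i for i, sid in enumerate(enc.sequence_ids(0)) if sid == 1]

# Map the answer's character span onto token start/end positions.
answer_end = answer_start + len(answer_text)
start_tok = next(i for i in context_tokens if offsets[i][1] > answer_start)
end_tok = next(i for i in reversed(context_tokens) if offsets[i][0] < answer_end)

# One training step's loss: the QA head learns to predict these positions.
loss = model(**enc,
             start_positions=torch.tensor([start_tok]),
             end_positions=torch.tensor([end_tok])).loss
```

In the joint training regime described above, examples prepared this way from English data and from roughly 10,000 target-language samples would simply be mixed into a single fine-tuning set.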