“…At the same time, current research within big data organization mainly focuses on building systems and models using the English language (Kaity and Balakrishnan, 2020), despite the vast amount of text available in many other languages. Consequently, it becomes necessary to deal with multilingual data for many big data analysis tasks, for example, sentiment classification (Pessutto et al., 2020), topic analysis (Xie et al., 2020), price forecasting (Li et al., 2020a, 2020b) and so forth. Taking the task of entity relation extraction (ERE) as an example, there are abundant annotated corpora in English, while in other language contexts tagged corpora are relatively scarce, and manual tagging for each language is an expensive and time-consuming task, especially for low-resource languages (Catelli et al., 2020; Kim et al., 2014).…”