“…For NER, we use NL, EN, and DE datasets from the CoNLL-2002 and CoNLL-2003 challenges (Tjong Kim Sang, 2002; Tjong Kim Sang and De Meulder, 2003). Additionally, we use the People's Daily dataset 4, iob2corpus 5, AQMAR (Mohit et al., 2012), ArmanPersoNERCorpus (Poostchi et al., 2016), MK-PUCIT (Kanwal et al., 2020), and a news-based NER dataset (Mordecai and Elhadad, 2012) for the languages CN, JA, AR, FA, UR, and HE, respectively. Since the NER datasets are individually constructed in each language, their label sets do not fully agree.…”
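
The language-to-dataset pairing in the passage above is easy to misread because of the long "respectively" list, so a minimal sketch of it as a lookup table may help. The names `NER_DATASETS` and `shared_labels` are hypothetical and not from the source. The split of NL to CoNLL-2002 and EN/DE to CoNLL-2003 follows the published shared tasks; the passage itself only cites both challenges jointly. The passage does not list the tag inventories, so the helper takes them as input and makes no claim about what they are.

```python
# Hypothetical lookup table: language code -> NER dataset named in the passage.
NER_DATASETS = {
    "NL": "CoNLL-2002",           # Tjong Kim Sang, 2002
    "EN": "CoNLL-2003",           # Tjong Kim Sang and De Meulder, 2003
    "DE": "CoNLL-2003",           # Tjong Kim Sang and De Meulder, 2003
    "CN": "People's Daily",
    "JA": "iob2corpus",
    "AR": "AQMAR",                # Mohit et al., 2012
    "FA": "ArmanPersoNERCorpus",  # Poostchi et al., 2016
    "UR": "MK-PUCIT",             # Kanwal et al., 2020
    "HE": "news-based NER dataset (Mordecai and Elhadad, 2012)",
}


def shared_labels(label_sets):
    """Return the entity labels common to every given label set.

    `label_sets` is an iterable of collections of label strings, one per
    dataset. The actual inventories are not given in the passage, so the
    caller has to supply them.
    """
    sets = [set(labels) for labels in label_sets]
    if not sets:
        return set()
    return set.intersection(*sets)
```

For example, with purely illustrative inventories, `shared_labels([{"PER", "LOC", "ORG", "MISC"}, {"PER", "LOC", "ORG"}])` keeps only the three labels both sets contain. This is one simple way to see where label sets that "do not fully agree" overlap.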