Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching 2018
DOI: 10.18653/v1/w18-3219

Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task

Abstract: In the third shared task of the Computational Approaches to Linguistic Code-Switching (CALCS) workshop, we focus on Named Entity Recognition (NER) on code-switched social-media data. We divide the shared task into two competitions based on the English-Spanish (ENG-SPA) and Modern Standard Arabic-Egyptian (MSA-EGY) language pairs. We use Twitter data and 9 entity types to establish a new dataset for code-switched NER benchmarks. In addition to the CS phenomenon, the diversity of the entities and the social medi…

Cited by 60 publications (69 citation statements)
References 24 publications
“…Besides, CRF mainly focuses on the transition probability of each word, and it does not pay enough attention to the name entity attributes of the word. Now, most of the NER methods are based on sequence labelling [12,16,17,18,19,20]. These methods classify every word in the corpus into different categories.…”
Section: Named Entity Recognition (mentioning)
confidence: 99%
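To make the sequence-labelling framing in the statement above concrete, here is a minimal sketch of how per-word labels can be turned back into entity mentions. It assumes the common BIO tagging convention; the function name `bio_to_spans`, the example tweet, and the `LOC` and `PER` labels are illustrative assumptions and are not taken from the shared-task data or from the citing paper.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity text, entity type) pairs from per-token BIO tags.

    Hypothetical helper for illustration only. An I- tag that does not
    continue an open entity of the same type is treated like O.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity; close any open one first.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continue the currently open entity of the same type.
            current_tokens.append(token)
        else:
            # O tag (or a stray I- tag): close any open entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Close an entity that runs to the end of the sequence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Made-up English-Spanish example; the labels are assumed, not from the dataset.
tokens = ["Vamos", "a", "New", "York", "with", "Maria", "mañana"]
tags = ["O", "O", "B-LOC", "I-LOC", "O", "B-PER", "O"]
print(bio_to_spans(tokens, tags))
```

Each word receives exactly one tag, so the model's task reduces to per-token classification, and entity mentions are recovered afterwards by grouping adjacent tags.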
“…For our experiment, we use English-Spanish tweets data provided by Aguilar et al (2018). There are nine entity labels.…”
Section: Experiments 41 Dataset (mentioning)
confidence: 99%
“…Code-Switching is produced in both written text and speech in a discourse. Recent studies in code-switching has been mainly focused on natural language tasks, such as language modeling (Winata et al, 2018a; Pratapa et al, 2018; Garg et al, 2018), named entity recognition (Aguilar et al, 2018), and language identification (Solorio et al, 2014; Molina et al, 2016; Barman et al, 2014). Code-Switching is considered as a challenging task because words from different languages may co-exist within a sequence, and models are required to recognize the context of mixed-language sentences.…”
Section: Introduction (mentioning)
confidence: 99%
“…2016; Aguilar et al. 2018), and distinguishing between very similar languages, or varieties and/or dialects of the same language, as for example, pursued in the VarDial/DSL series of workshops and shared tasks (Zampieri et al. 2014; Zampieri et al.…”
Section: Introduction (mentioning)
confidence: 99%