From English to Code-Switching: Transfer Learning with Strong Morphological Clues

Aguilar, Gustavo; Solorio, Thamar

doi:10.18653/v1/2020.acl-main.716

Cited by 29 publications

(16 citation statements)

References 19 publications

(31 reference statements)

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Code-mixed text processing. Previous research on code-mixed text processing focused on constructing formal grammars (Joshi, 1982) and tokenlevel language identification (Bali et al, 2014;Solorio et al, 2014;Barman et al, 2014), before progressing to named entity recognition and part-of-speech tagging (Ball and Garrette, 2018;AlGhamdi and Diab, 2019;Aguilar and Solorio, 2020). Recent work explores code-mixing in higher-level tasks such as question answering and task-oriented dialogue Ahn et al, 2020).…”

Section: Related Workmentioning

confidence: 99%

Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots

Tan¹,

Joty²

2021

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua

View full text Add to dashboard Cite

Multilingual models have demonstrated impressive cross-lingual transfer performance. However, test sets like XNLI are monolingual at the example level. In multilingual communities, it is common for polyglots to code-mix when conversing with each other. Inspired by this phenomenon, we present two strong blackbox adversarial attacks (one word-level, one phrase-level) for multilingual models that push their ability to handle code-mixed sentences to the limit. The former uses bilingual dictionaries to propose perturbations and translations of the clean example for sense disambiguation. The latter directly aligns the clean example with its translations before extracting phrases as perturbations. Our phrase-level attack has a success rate of 89.75% against XLM-R large , bringing its average accuracy of 79.85 down to 8.18 on XNLI. Finally, we propose an efficient adversarial training scheme that trains in the same number of steps as the original model and show that it improves model accuracy. 1 Original P: The girl that can help me is all the way across town. H: There is no one who can help me. Adversary P: olan girl that can help me is all the way across town. H: one who can help me. Prediction Before: Contradiction After: Entailment Original P: We didn't know where they were going. H: We didn't know where the people were traveling to. Adversary P: We didn't know where they were going. H: We didn't know where les gens allaient. Prediction Before: Entailment After: Neutral Original P: Well it got to where there's two or three aircraft arrive in a week and I didn't know where they're flying to.H: There are never any aircraft arriving. Adversary P: общем, дошло до mahali there's two or three aircraft arrive in a week and I didn't know where they're flying to.

show abstract

Section: Related Workmentioning

confidence: 99%

Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots

Tan¹,

Joty²

2021

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua

View full text Add to dashboard Cite

show abstract

“…Linguistic code-switching is a phenomenon where multilingual speakers alternate between languages. Recently, monolingual models have been adapted to code-switched text in entity recognition (Aguilar and Solorio, 2019), part-ofspeech tagging (Soto and Hirschberg, 2018;Ball and Garrette, 2018), sentiment analysis (Joshi et al, 2016) and language identification (Mave et al, 2018;Yirmibeşoglu and Eryigit, 2018;Mager et al, 2019). Recently, KhudaBukhsh et al, 2020 have proposed an approach to sample code-mixed documents using minimal supervision.…”

Section: Related Workmentioning

confidence: 99%

Multilingual Code-Switching for Zero-Shot Cross-Lingual Intent Prediction and Slot Filling

Krishnan¹,

Anastasopoulos²,

Purohit³

et al. 2021

Proceedings of the 1st Workshop on Multilingual Representation Learning

View full text Add to dashboard Cite

Predicting user intent and detecting the corresponding slots from text are two key problems in Natural Language Understanding (NLU). Since annotated datasets are only available for a handful of languages, our work focuses particularly on a zero-shot scenario where the target language is unseen during training. In the context of zero-shot learning, this task is typically approached using representations from pre-trained multilingual language models such as mBERT or by fine-tuning on data automatically translated into the target language. We propose a novel method which augments monolingual source data using multilingual code-switching via random translations, to enhance generalizability of large multilingual language models when fine-tuning them for downstream tasks. Experiments on the Mul-tiATIS++ benchmark show that our method leads to an average improvement of +4.2% in accuracy for the intent task and +1.8% in F1 for the slot-filling task over the state-of-the-art across 8 typologically diverse languages. We also study the impact of code-switching into different families of languages on downstream performance. Furthermore, we present an application of our method for crisis informatics using a new human-annotated tweet dataset of slot filling in English and Haitian Creole, collected during the Haiti earthquake. 1

show abstract

“…Bi-LSTM CRF (I): This is the most commonly used model for the LI and POS tasks. It uses Bi-LSTM and CRF consecutively for POS tagging the CM text (Aguilar and Solorio, 2019;Bhattu et al, 2020a).…”

Section: Fcrf (J)mentioning

confidence: 99%

A Pre-trained Transformer and CNN model with Joint Language ID and Part-of-Speech Tagging for Code-Mixed Social-Media Text

Dowlagar

Mamidi

2021

Proceedings of the Conference Recent Advances in Natural Language Processing - Deep Learning for Natural Language Processing Me

View full text Add to dashboard Cite

Code-mixing (CM) is a frequently observed phenomenon that uses multiple languages in an utterance or sentence. There are no strict grammatical constraints observed in codemixing, and it consists of non-standard variations of spelling. The linguistic complexity resulting from the above factors made the computational analysis of the code-mixed language a challenging task. Language identification (LI) and part of speech (POS) tagging are the fundamental steps that help analyze the structure of the code-mixed text. Often, the LI and POS tagging tasks are interdependent in the code-mixing scenario. We project the problem of dealing with multilingualism and grammatical structure while analyzing the code-mixed sentence as a joint learning task. In this paper, we jointly train and optimize language detection and part of speech tagging models in the code-mixed scenario. We used a Transformer with convolutional neural network architecture. We train a joint learning method by combining POS tagging and LI models on code-mixed social media text obtained from the ICON shared task.

show abstract

From English to Code-Switching: Transfer Learning with Strong Morphological Clues

Cited by 29 publications

References 19 publications

Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots

Code-Mixing on Sesame Street: Dawn of the Adversarial Polyglots

Multilingual Code-Switching for Zero-Shot Cross-Lingual Intent Prediction and Slot Filling

A Pre-trained Transformer and CNN model with Joint Language ID and Part-of-Speech Tagging for Code-Mixed Social-Media Text

Contact Info

Product

Resources

About