2021
DOI: 10.1017/s1351324921000115

SwitchNet: Learning to switch for word-level language identification in code-mixed social media text

Abstract: Word-level language identification is an essential prerequisite for extracting useful information from code-mixed social media content. Previous studies in word-level language identification show two important observations. First, the local context is an important indicator of the language of a word when a word is valid in multiple languages. Second, considering the word in isolation from its context leads to more effective language classification when a word is borrowed or embedded into sentences of other lan…

Cited by 8 publications (10 citation statements)
References: 39 publications
“…The convolution layer in CNN building blocks extracts text features by applying a convolutional filter or kernel to each window in the sequence of text. Sarma et al [63] experimented with CNN and BLSTM, and CNN has shown the best performance among the other techniques with an F1 score of 91.03%. Some studies combined two ANN modules, CNN and LSTM or CNN and BLSTM, in their neural network architecture.…”
Section: 1) Machine Learning Approach (mentioning)
confidence: 99%
“…The ambiguity existed in several LID of code-mixed text studies, for example, Punjabi-English [16], Hindi-English [15,37,61], Malayalam-English [64], Bengali-English [37], Gujarati-English [37,59], Spanish-English [15,38,42,72], Dutch-English [20], Turkish-German & Spanish-Wixarika [22], Modern Standard Arabic-Arabic [42,72], Konkani-English [45], Swahili-English [40], English-Assamese-Hindi-Bengali [63], Sinhala-English [56]. The annotation of mixed languages becomes increasingly complicated when the languages are closely related [16].…”
Section: ) Ambiguity (mentioning)
confidence: 99%
“…Neelakshi Sarma et al have developed a framework for language identification which is capable of recognizing words that are borrowed from other languages and used in multiple languages and predict the language. This framework also considers the context of the sentence [11].…”
Section: Related Work (mentioning)
confidence: 99%