Proceedings of the 11th Forum for Information Retrieval Evaluation 2019
DOI: 10.1145/3368567.3368578

Language Identification of Bengali-English Code-Mixed Data using Character & Phonetic based LSTM Models

citations
Cited by 4 publications
(7 citation statements)
references
References 5 publications
“…Another RNN variation technique, LSTM, has shown satisfactory performance in identifying Hindi-English and Bengali-English code-mixed text [27,52,54]. In [52], the LSTM architecture could give a high average F1 score of 93.4% and an average accuracy of 96.1% across the three classes.…”
Section: 1) Machine Learning Approach (mentioning)
confidence: 99%
“…The following we identified some non-standard words encountered from the investigated papers. We categorised the non-standard words into four types, such as non-standard spelling [7,15,56], abbreviated words [3,37,39,45,49,56,64], exaggerated words [3,7,27,39,45,47,49,50,51,64], and mixing characters with numbers or special characters [3,27,39,50]. Table 6 describes some examples of non-standard words found in code-mixed text LID.…”
Section: ) Non-standard Words (mentioning)
confidence: 99%
“…Given an annotated corpus, the output of the language models is combined with other features to train a word-level language classifier. In addition to character n-gram information, phonetic information has also been used for word language identification in Das et al (2019).…”
Section: Related Studies (mentioning)
confidence: 99%