2020
DOI: 10.20944/preprints202006.0223.v1
Preprint

Cooking is All About People: Comment Classification on Cookery Channels Using BERT and Classification Models (Malayalam-English Mix-Code)

Abstract: The scope of a lucrative career promoted by Google through its video distribution platform YouTube has attracted a large number of users to become content creators. An important aspect of this line of work is the feedback received in the form of comments, which shows how well the content is being received by the audience. However, the volume of comments, coupled with spam and limited tools for comment classification, makes it virtually impossible for a creator to go through each and every comment and gather construc…

Cited by 6 publications (5 citation statements)
References 21 publications
“…Most recently, fine-tuning pre-trained monolingual models such as BERT and multilingual models such as XLM-R and mBERT (Hande et al., 2021b) have been used. PLMs outperformed other Deep Learning and Machine Learning techniques in some studies (Chakravarthi et al., 2020; Aguilar et al., 2020; Khanuja et al., 2020), while they performed poorly in some others (Kazhuparambil and Kaushik, 2020). But according to Kazhuparambil and Kaushik (2020), PLMs can be made the top-performing models for CMCS data classification by optimizing hyper-parameters.…”
Section: CMCS Text Classification (mentioning)
confidence: 98%
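The tuning claim in the statement above, that pre-trained language models become top performers on code-mixed classification once hyper-parameters are optimized, amounts in practice to a search over configurations. Below is a minimal, hypothetical sketch of such a grid search: `evaluate` is a stand-in for a real fine-tuning and validation run, and the search-space values are illustrative, not taken from the cited paper.

```python
from itertools import product

# Illustrative search space (typical BERT fine-tuning ranges, not the
# paper's actual values).
SEARCH_SPACE = {
    "learning_rate": [2e-5, 3e-5, 5e-5],
    "batch_size": [16, 32],
    "epochs": [2, 3, 4],
}

def evaluate(config):
    """Hypothetical scorer; a real version would fine-tune a PLM and
    return validation accuracy. This toy score keeps the sketch runnable."""
    return config["learning_rate"] * 1e4 - abs(config["epochs"] - 3)

def grid_search(space, scorer):
    """Exhaustively score every configuration and keep the best one."""
    keys = list(space)
    best_cfg, best_score = None, float("-inf")
    for values in product(*(space[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = scorer(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

best, score = grid_search(SEARCH_SPACE, evaluate)
```

In a real setup the loop body would call a fine-tuning routine per configuration; the search structure is the same.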
“…Most recently, fine-tuning pre-trained monolingual models such as BERT and multilingual models such as XLM-R and mBERT (Hande et al, 2021b) have been used. PLMs outperformed other Deep Learning and Machine Learning techniques in some studies (Chakravarthi et al, 2020;Aguilar et al, 2020;Khanuja et al, 2020), while they performed poorly in some others (Kazhuparambil and Kaushik, 2020). But according to Kazhuparambil and Kaushik (2020), PLMs can be made the top-performing models for CMCS data classification by optimizing hyper-parameters.…”
Section: Cmcs Text Classificationmentioning
confidence: 98%
“…The most notable one is the Spanish-English and Hindi-English code-mixed datasets released for the SemEval 2020 task (Javdan et al., 2020). Other than that, there are code-mixed datasets available between Malayalam-English (Kazhuparambil and Kaushik, 2020), and Tamil-English (Chakravarthi et al., 2020). Interestingly, other than the Spanish-English dataset, all the other datasets we identified involve Indic languages.…”
Section: Classifying Code-mixed Data (mentioning)
confidence: 99%
“…Interestingly, other than the Spanish-English dataset, all the other datasets we identified involve Indic languages. The Malayalam-English dataset was obtained from a food recipe corpus with about 3,000 samples (Kazhuparambil and Kaushik, 2020). Figure 1 and Table 1 show the class distribution and a few samples of the Malayalam-English dataset, respectively.…”
Section: Classifying Code-mixed Data (mentioning)
confidence: 99%
“…There exist many challenges when performing sentiment analysis on the multilingual text as explained in [28], [29], [30]. The first challenge is to perform language identification.…”
Section: Challenges With Multilingual Text (mentioning)
confidence: 99%
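The language-identification step mentioned above can be approximated, for Malayalam-English text written in mixed scripts, with a simple Unicode-block heuristic: Malayalam script occupies the block U+0D00 to U+0D7F. This is a hedged sketch, not the cited papers' method; it only catches Malayalam in native script, so romanized Malayalam (common in YouTube comments) would still require a trained identifier.

```python
def token_language(token: str) -> str:
    """Label a token 'ml' if it contains any Malayalam-script character
    (Unicode block U+0D00-U+0D7F), else fall back to 'en'."""
    if any("\u0d00" <= ch <= "\u0d7f" for ch in token):
        return "ml"
    return "en"

# A code-mixed comment: Malayalam script interleaved with English words.
comment = "ഈ recipe super ആണ്"
labels = [(tok, token_language(tok)) for tok in comment.split()]
```

A production pipeline would layer a statistical model on top of this heuristic to handle romanized tokens and ambiguous loanwords.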