2021
DOI: 10.48550/arxiv.2107.03844
Preprint

A Review of Bangla Natural Language Processing Tasks and the Utility of Transformer Models

Abstract: Bangla, ranked as the 6th most widely spoken language across the world with 230 million native speakers, is still considered a low-resource language in the natural language processing (NLP) community. With three decades of research, Bangla NLP (BNLP) is still lagging behind mainly due to the scarcity of resources and the challenges that come with it. There is sparse work in different areas of BNLP; however, a thorough survey reporting previous work and recent advances is yet to be done. In this study, we…

Citations: cited by 8 publications (15 citation statements)
References: 63 publications
“…Early research includes rule-based and classical methodologies whereas recent studies include deep learning-based and pretrained language models. Researchers have been trying to develop resources over time and as a result, manual and semi-supervised approaches (Chowdhury and Chowdhury, 2014; Alam et al., 2021; Islam et al., 2021, 2023; Kabir et al., 2023) have been adopted in developing sentiment classification datasets. Chowdhury and Chowdhury (2014) used a semi-supervised technique to annotate data and train classical models.…”
Section: Literature Review (mentioning)
confidence: 99%
“…The interest in low-resource languages is growing over time in sentiment analysis (Batanović et al., 2016; Nabil et al., 2015; Muhammad et al., 2023). Unlike other languages, a limited number of study has been done to develop resources for Bangla sentiment analysis (Hasan et al., 2020a; Alam et al., 2021; Islam et al., 2021; Hasan et al., 2023b; Islam et al., 2023). From the perspective of modeling, there have been studied both classical (i.e., SVM, RF, Naive Bayes) and deep learning (i.e., CNN, LSTM) models.…”
Section: Introduction (mentioning)
confidence: 99%
“…Some exciting yet typical applications of textual NLP-based tools are text classifier, translator, entity identifier, summarizer, answering question system. Automated text generation providing some source text is a crucial component of many of those NLP systems where it produces coherent human-like text as output [3]. Sequence-to-Sequence, shortly known as Seq2Seq, is an algorithmic model that takes an input sequence, processes it, produces an output sequence, and text generation problems can be modeled using it.…”
Section: Introduction (mentioning)
confidence: 99%