Proceedings of the Fourth Arabic Natural Language Processing Workshop 2019
DOI: 10.18653/v1/w19-4631

Hierarchical Deep Learning for Arabic Dialect Identification

Abstract: In this paper, we present two approaches for Arabic Fine-Grained Dialect Identification. The first approach is based on Recurrent Neural Networks (BLSTM, BGRU) using hierarchical classification. The main idea is to split the classification of a sentence from a given text into two stages: we start with a coarse level of classification (8 classes) and then move to the finer-grained classification (26 classes). The second approach is a voting system based on Naive Bayes and Random Forest. Our system ac…
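The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the paper's recurrent networks (BLSTM, BGRU) for a tiny character-bigram Naive Bayes so that it runs with the Python standard library only. The class names `TinyNaiveBayes` and `HierarchicalClassifier`, the labels `region_A`/`city_A1`, and the Latin placeholder strings are all hypothetical toy values, not data from the paper.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return the character n-grams of a space-padded string."""
    padded = f" {text} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class TinyNaiveBayes:
    """Multinomial Naive Bayes over character bigrams, add-one smoothing.

    Stand-in for the paper's neural models (an assumption of this sketch).
    """

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)  # label -> bigram counts
        self.priors = Counter(labels)       # label -> number of examples
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text)
            self.counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total = sum(self.priors.values())
        vocab_size = len(self.vocab)

        def score(label):
            denom = sum(self.counts[label].values()) + vocab_size
            logp = math.log(self.priors[label] / total)
            for gram in char_ngrams(text):
                logp += math.log((self.counts[label][gram] + 1) / denom)
            return logp

        return max(self.priors, key=score)


class HierarchicalClassifier:
    """Stage 1 picks a coarse group; stage 2 picks a fine label within it."""

    def fit(self, texts, coarse, fine):
        self.coarse_model = TinyNaiveBayes().fit(texts, coarse)
        self.fine_models = {}
        for group in set(coarse):
            idx = [i for i, c in enumerate(coarse) if c == group]
            self.fine_models[group] = TinyNaiveBayes().fit(
                [texts[i] for i in idx], [fine[i] for i in idx]
            )
        return self

    def predict(self, text):
        group = self.coarse_model.predict(text)
        return group, self.fine_models[group].predict(text)


# Toy placeholder data (hypothetical; not real dialect text).
texts = ["aaa aaa", "aba aba", "zzz zzz", "zyz zyz"]
coarse = ["region_A", "region_A", "region_B", "region_B"]
fine = ["city_A1", "city_A2", "city_B1", "city_B2"]

model = HierarchicalClassifier().fit(texts, coarse, fine)
print(model.predict("aaa"))
print(model.predict("zyz"))
```

In the paper's setting the coarse stage would distinguish 8 classes and the fine stage 26. One trade-off of this design is that an error in the first stage cannot be recovered in the second, since each fine-grained model only knows the labels of its own group.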

Citations: cited by 5 publications (2 citation statements)
References: 6 publications
“…Many attempts have been proposed in the area of automatic dialect identification (ADI), and early uses are based on dictionaries, rules, and language modeling [5][6][7][8][9][10]; more recently, a shift was made toward employing machine learning techniques [11][12][13][14][15][16][17][18][19][20][21][22][23][24], deep learning approaches [25][26][27][28][29][30][31][32][33][34][35][36][37][38][39][40], and transfer learning methods [41][42][43][44][45][46][47][48][49]. Many of these investigations utilize prominent and accessible datasets, such as MADAR [49], NADI [50][51]…”
Section: Introduction
confidence: 99%
“…MADAR has been established as an important corpus for the task, serving as a benchmark for multi-task learning (Seelawi et al., 2021), as well as a Shared Task corpus (Bouamor et al., 2019), and as a subject of independent research (Baimukan et al., 2022). Despite several attempts to develop models using deep neural networks (Lippincott et al., 2019; de Francony et al., 2019) and pre-trained Transformer-based language models (Inoue et al., 2021), the current state-of-the-art approach remains a statistical machine learning model with surface-level feature representation, specifically the Multinomial Naive Bayes (MNB) model introduced by .…”
Section: Introduction
confidence: 99%