2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
DOI: 10.1109/asru46091.2019.9003960

The MGB-5 Challenge: Recognition and Dialect Identification of Dialectal Arabic Speech

Abstract: This paper describes the fifth edition of the Multi-Genre Broadcast Challenge (MGB-5), an evaluation focused on Arabic speech recognition and dialect identification. MGB-5 extends the previous MGB-3 challenge in two ways: first it focuses on Moroccan Arabic speech recognition; second the granularity of the Arabic dialect identification task is increased from 5 dialect classes to 17, by collecting data from 17 Arabic speaking countries. Both tasks use YouTube recordings to provide a multi-genre multi-dialectal …

Cited by 34 publications (29 citation statements)
References 22 publications
“…[202]), there has been less work on accent and dialect adaptive speech recognition systems. The MGB-3 [203] and MGB-5 [204] evaluation challenges have used dialectal Arabic test sets, with a modern standard Arabic (MSA) training set, using broadcast and internet video data. The best results reported on these challenges have used a straightforward model-based transfer learning approach in an LF-MMI framework, adapting MSA trained baseline systems to specific Arabic dialects [205], [206].…”
Section: Accent Adaptation (mentioning)
confidence: 99%
“…There is relatively few research works on ASR for Algerian dialects in order to be able to compare our obtained results. However, in the last edition of the MGB challenge, MGB5 [27], there was a task about ASR for Moroccan dialect, which is relatively close to the Algerian dialect because they share several linguistic and acoustic aspects. The best system obtained a WER of 37.6%, knowing that 13 hours of dialectal speech were used with 1200 hours of MSA to train the acoustic model.…”
Section: Transfer Learning (mentioning)
confidence: 99%
“…Despite Arabic being one of the most popular languages (ranking 6th according to total number of speakers [37]) and the prevalence of CS across Arab countries [13,46,12], there is a huge research gap in the field of the Arabic CS ASR. While many researchers have worked on ASR for Modern Standard Arabic [38,58,72,9,25,5] and dialectal Arabic [59,60,89,40,5,8,75,10,6,7], work on CS Arabic ASR is still in its initial stages. In [18], Bayeh et al presented an ASR system for Arabic Broadcast News (BN), where they rely on crosslingual techniques to allow the baseline systems trained on MSA data to recognize dialectal languages (mainly Levantine/Maghrebian) as well as French embedded language.…”
Section: Arabic CS ASR (mentioning)
confidence: 99%