Proceedings of the Fourth Widening Natural Language Processing Workshop 2020
DOI: 10.18653/v1/2020.winlp-1.5

Large Vocabulary Read Speech Corpora for Four Ethiopian Languages: Amharic, Tigrigna, Oromo, and Wolaytta

Cited by 8 publications
(6 citation statements)
References 7 publications
“…If the homogeneity assumption is proven, the researcher can proceed to the advanced data analysis stage. In this homogeneity test, IBM SPSS Statistics 22 is used (Abate et al, 2020).…”
Section: Homogeneity Test (mentioning)
confidence: 99%
“…The most widely spoken language in Ethiopia is Afaan Oromo, which has a 33.8% speaker followed by a 29.3% speaker for Amharic [19]. Afaan Oromo belongs to the Cushitic language family group of the Afroasiatic language family, while Amharic is a member of the Semitic language family group [20]. After Arabic and Hausa, Afaan Oromo is the third most frequently used Afro-Asiatic language in the world [21].…”
Section: Introduction (mentioning)
confidence: 99%
“…In the literature, different studies have been carried out to solve the problem of a large vocabulary. First of all, the creation of a corpus with a large vocabulary was studied and ASR systems with a large vocabulary were developed [15,16]. However, a balanced Turkish dataset of spontaneous conversations and conversations in different fields is not currently available.…”
Section: Introduction (mentioning)
confidence: 99%