2019
DOI: 10.3390/info10090268

The Usefulness of Imperfect Speech Data for ASR Development in Low-Resource Languages

Abstract: When the National Centre for Human Language Technology (NCHLT) Speech corpus was released, it created various opportunities for speech technology development in the 11 official, but critically under-resourced, languages of South Africa. Since then, the substantial improvements in acoustic modeling that deep architectures achieved for well-resourced languages ushered in a new data requirement: their development requires hundreds of hours of speech. A suitable strategy for the enlargement of speech resources for…

Citations: cited by 8 publications (14 citation statements)
References: 24 publications (31 reference statements)
“…The data contains additional examples of speech and provides more examples of languages recorded in varying acoustic conditions. It has been shown that combining this data set with existing speech data improves the recognition accuracy of ASR systems [2]. This is an important consideration, because there are currently almost no other resources available for speech technology development in South Africa.…”
Section: Value of the Data (mentioning)
confidence: 99%
“…This can present one of the main research field challenges. It is known [1,31] from artificial intelligence and the Internet of Things that open data can stimulate the research area's development.…”
Section: Audio Data (mentioning)
confidence: 99%
“…The computation complexity of the loss function is (| |). Note, however, that the element-wise multiplication, division, and log operation can be implemented in parallel with graphical parallel unit (GPU) at O(1). In contrast, the implementation of CTC [48] based on a forward-backward algorithm has a computation complexity of ( • ), here is the sequence length and is the number of lookahead steps.…”
Section: Computational Complexity Analysis (mentioning)
confidence: 99%
“…Preservation of endangered languages may require speech recording, processing, and automatic recognition. Since the early days of modern computer science, automatic speech recognition (ASR) has been one of the biggest and hardest challenges in the field that requires huge volumes of speech data [1]. Over the years, a large majority of research conducted in this area focused on the most widely spoken languages, such as English, French, Mandarin Chinese, etc.…”
Section: Introduction (mentioning)
confidence: 99%