2019
DOI: 10.15625/1813-9663/34/4/13165

Development of High-Performance and Large-Scale Vietnamese Automatic Speech Recognition Systems

Abstract: Automatic Speech Recognition (ASR) systems automatically convert human speech into the corresponding transcription. They have a wide range of applications, such as controlling robots, call center analytics, and voice chatbots. Recent studies on ASR for English have achieved performance that surpasses human ability. These systems were trained on large amounts of data and performed well in many environments. With regard to Vietnamese, there have been many studies on improving the performance of existin…

Cited by 2 publications (7 citation statements)
References: 9 publications
“…Table 1. The differences among tones in Vietnamese are illustrated in Table 2 [7]. The monosyllabic nature of Vietnamese, combined with its tonal system, adds a layer of complexity to the language.…”
Section: Characteristics Of Vietnamese Language
Citation type: mentioning
confidence: 99%
“…All audio files were converted to the wave format with a sampling frequency of 16 kHz and PCM 16 bits. In [7], three Vietnamese speech corpora have been introduced. Those corpora include two small reading speech corpora with a total of 6 h and 6.5 h, respectively, and a large-scale speech corpus with 900 h. The large-scale speech corpus was collected by crawling untranscripted audio from various resources, such as movies, YouTube movies, and electronic newspapers.…”
Section: Previous Work On Vietnamese Speech Corpus
Citation type: mentioning
confidence: 99%