2021 IEEE International Conference on Multimedia and Expo (ICME)
DOI: 10.1109/icme51207.2021.9428160

Speech Synthesis of Chinese Braille with Limited Training Data

Abstract: This paper describes, to our knowledge, the first Chinese Braille speech synthesis system. The system consists of modules for Braille front-end processing, prosody prediction, and speech synthesis. The Braille front-end processing includes conversion from common Braille to Pinyin and a high-precision Chinese character prediction model. To achieve high-precision prosody prediction under limited-corpus conditions, we propose a prosody prediction model based on the RoBERTa pre-trained model, which achieves an a…
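The abstract only names the pipeline stages, so the following is a minimal illustrative sketch of how such a Braille-to-speech pipeline could be wired together. It is not the authors' implementation: every function and class name below is a hypothetical placeholder, and each stage body would be replaced by the paper's actual Braille-to-Pinyin rules, character prediction model, RoBERTa-based prosody tagger, and TTS back end.

```python
# Illustrative pipeline skeleton for a Chinese Braille speech synthesis system.
# Assumption: all names here are hypothetical placeholders, not the paper's code.

from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    braille: str         # raw Chinese Braille cells
    pinyin: List[str]    # Pinyin syllables recovered from the Braille
    hanzi: str           # predicted Chinese characters
    prosody: List[str]   # prosodic-boundary labels per character


def braille_to_pinyin(braille: str) -> List[str]:
    """Front-end step 1: map common Braille cells to Pinyin syllables (rule-based)."""
    raise NotImplementedError  # placeholder for a cell-to-syllable conversion


def predict_hanzi(pinyin: List[str]) -> str:
    """Front-end step 2: resolve homophones with a Chinese character prediction model."""
    raise NotImplementedError  # placeholder for a Pinyin-to-character model


def predict_prosody(hanzi: str) -> List[str]:
    """Prosody prediction, e.g. a pre-trained RoBERTa encoder fine-tuned as a
    token classifier over prosodic boundary labels (assumed setup)."""
    raise NotImplementedError  # placeholder for the prosody tagger


def synthesize(hanzi: str, prosody: List[str]) -> bytes:
    """Speech synthesis back end conditioned on the text and prosody labels."""
    raise NotImplementedError  # placeholder for acoustic model + vocoder


def braille_to_speech(braille: str) -> bytes:
    """End-to-end flow: Braille -> Pinyin -> characters -> prosody -> waveform."""
    pinyin = braille_to_pinyin(braille)
    hanzi = predict_hanzi(pinyin)
    prosody = predict_prosody(hanzi)
    return synthesize(hanzi, prosody)
```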

Cited by 2 publications (2 citation statements). References: 10 publications.
“…They claimed a high accuracy of their proposed method. A Chinese character recognition model is proposed in [30] that obtained an accuracy of 94.42%. The proposed method in [30] is extended to read the characters in natural speech.…”
Section: Introduction
Confidence: 99%
“…A Chinese character recognition model is proposed in [30] that obtained an accuracy of 94.42%. The proposed method in [30] is extended to read the characters in natural speech. A combination of conventional sequence mapping method and deep learning method has been used in [31] to convert Braille characters into Hindi language, where output was generated as a speech.…”
Section: Introduction
Confidence: 99%