Interspeech 2016
DOI: 10.21437/interspeech.2016-528

Tone Classification in Mandarin Chinese Using Convolutional Neural Networks

Abstract: In tone languages, different tone patterns of the same syllable may convey different meanings. Tone perception is important for sentence recognition in noise conditions, especially for children with cochlear implants (CI). We propose a method that fully automates tone classification of syllables in Mandarin Chinese. Our model takes as input the raw tone data and uses convolutional neural networks to classify syllables into one of the four tones in Mandarin. When evaluated on syllables recorded from normal-hear…

Cited by 24 publications (30 citation statements)
References 23 publications

“…[7,8] applies DNN to tone recognition on female corpus and some good results are achieved. More recently, [9] employs Convolutional Neural Network (CNN) for speech evaluation of the hearing-impaired population. However, feedforward neural networks like DNN and CNN are not designed to model time-series so that it is difficult to handle the F0 variations especially in continuous speech.…”
Section: Introduction
mentioning
confidence: 99%
“…The highest accuracy (95.53%) was obtained using the convolutional neural network (CNN) and the mel-frequency cepstral coefficients (MFCCs) for tone pronunciation of 4500 syllables by 125 children aged 3 to 10. This study [34] showed that tone recognition by machine learning is possible; however, there are some shortcomings in learning tones with the tone recognition system. This study used children's speech, which had limitations on the pitch range, and the spectrogram and MFCCs of pronounced monosyllabic words in the paper showed that the third tone was a dipping-rising tone (Figure 4).…”
Section: Learning Mandarin Tone With Machine Learning
mentioning
confidence: 98%
“…Machine learning data from previous studies on tone recognition [34]. Top: Time waveforms; Middle: Spectrograms; Bottom: Mel-frequency cepstral coefficients (MFCCs).…”
mentioning
confidence: 99%
“…Remarkably, they report that the MFCC-based recognizer handily outperforms the HDPF-based recognizer. Similarly, in [13], Chen et al train a convolutional network to take as input a window of MFCCs for a single tonal syllable and predict its tone. Although F0 can be estimated very accurately, these results show that F0-based features are not the best features for tone recognition, or at least that there is some information in the input signal that is lost when HDPFs alone are used.…”
Section: Existing Approaches
mentioning
confidence: 99%