ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8682826

Improving Children Speech Recognition through Feature Learning from Raw Speech Signal

Abstract: Children speech recognition based on short-term spectral features is a challenging task. One of the reasons is that children speech has high fundamental frequency that is comparable to formant frequency values. Furthermore, as children grow, their vocal apparatus also undergoes changes. This presents difficulties in extracting standard short-term spectral-based features reliably for speech recognition. In recent years, novel acoustic modeling methods have emerged that learn both the feature and phone classifie…

Cited by 14 publications (4 citation statements)
References 10 publications (13 reference statements)
“…All of these features decrease the performance of common speech recognition systems, as shown in Kennedy et al (2017). While there are efforts to increase speech recognition performance to account for children's acoustic variability, such as Dubagunta et al (2019), Gale et al (2019), Wu et al (2019), or Shivakumar and Georgiou (2020), other above-mentioned inconsistencies can still prevent speech-based systems from understanding or fulfilling the communicative intent of a child in spite of the low word recognition error rate, which could be observed in the study by Lovato et al (2019). Despite all these issues, a lot of children have access to and engage with voice assistants in their everyday lives.…”
Section: Introduction (mentioning)
confidence: 99%
“…ASR systems like Alexa, Siri, and Google should have been an asset to such growth, providing a more frictionless interaction with technology. However, children's speech recognition seems to be a tough nut to crack due to the acoustic and linguistic variability that a child's speech often brings to the table [130]. ASR for child speech is proven more challenging than that for adult speech, due to children's shorter vocal tracts, slower and more variable speaking rate and inaccurate articulation [131].…”
Section: Children-centric Focus (mentioning)
confidence: 99%
“…This is improvised using CNN-based end-to-end acoustic modelling methods. For feature extraction, MFCC is used in this system [22].…”
Section: Review Of Literature (mentioning)
confidence: 99%