Interspeech 2016
DOI: 10.21437/interspeech.2016-1085

Recognition of Dysarthric Speech Using Voice Parameters for Speaker Adaptation and Multi-Taper Spectral Estimation

Abstract: Dysarthria is a motor speech disorder resulting from impairment in muscles responsible for speech production, often characterized by slurred or slow speech resulting in low intelligibility. With speech-based applications such as voice biometrics and personal assistants gaining popularity, automatic recognition of dysarthric speech becomes imperative as a step towards including people with dysarthria into the mainstream. In this paper we examine the applicability of voice parameters that are traditionally used for …

Citations: cited by 17 publications (14 citation statements)
References: 29 publications
“…Several videos available on the internet were selected and visual subtitles were generated (Bhat et al, 2013b) as described earlier.…”
Section: Methods (mentioning)
confidence: 99%
“…The main contributions of the paper are summarized below: 1) To the best of our knowledge, our novel spectro-temporal deep feature based adaptation approach is the first work to exploit auxiliary speaker embedding features in disordered speech adaptation. In contrast, prior works [10,12,13,27,34-39] focused on feature transformation and model based adaptation. Speaker embedding features, e.g.…”
Section: Introduction (mentioning)
confidence: 98%
“…In [12], a combination of MLLR and MAP adaptation were used in speaker adaptive training (SAT) of SI GMM-HMM models. In [36], f-MLLR based SAT was studied. In [37], regularized speaker adaptation on Kullback-Leibler divergence-based HMMs (KL-HMMs) was conducted.…”
Section: Introduction (mentioning)
confidence: 99%
“…A comparative study of several types of ASR systems including maximum likelihood and maximum a posteriori (MAP) adaptation showed a significant improvement in dysarthric speech recognition when speaker adaptation using MAP adaptation was applied [6]. Word error rate for dysarthric speech was reduced using voice parameters such as jitter and shimmer along with multi-taper Mel-frequency Cepstral Coefficients (MFCC) followed by speaker adaptation [7], and using Elman back-propagation network (EBN) which is a recurrent, self supervised neural network along with glottal features and MFCC in [8]. A multi-stage deep neural network (DNN) training scheme is used to better model dysarthric speech, wherein only a small amount of in-domain training data showed considerable improvement in the recognition of dysarthric speech [9].…”
Section: Introduction (mentioning)
confidence: 99%
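
The last citation statement refers to multi-taper MFCCs. As a rough illustration of the spectral-estimation step only, the sketch below averages the periodograms of one speech frame under several DPSS (Slepian) tapers, which is the usual way a multi-taper spectrum replaces the single-window spectrum before the Mel filterbank. It is not the cited paper's implementation: the function name `multitaper_power_spectrum`, the taper count, the time-bandwidth value, and the frame settings are all assumptions chosen for the example.

```python
import numpy as np
from scipy.signal.windows import dpss


def multitaper_power_spectrum(frame, n_tapers=6, nw=3.5):
    """Average the periodograms of one frame under several DPSS tapers.

    Hypothetical helper for illustration; n_tapers and nw are assumed
    values, not settings taken from the cited paper.
    """
    frame = np.asarray(frame, dtype=float)
    # With Kmax given, dpss returns an array of shape (n_tapers, len(frame)).
    tapers = dpss(len(frame), NW=nw, Kmax=n_tapers)
    # One periodogram per taper (rows), then a uniform average across tapers.
    spectra = np.abs(np.fft.rfft(tapers * frame, axis=1)) ** 2
    return spectra.mean(axis=0)


# Usage: a 25 ms frame of a 1 kHz tone sampled at 16 kHz (assumed settings).
fs = 16000
t = np.arange(400) / fs
tone = np.sin(2 * np.pi * 1000 * t)
spectrum = multitaper_power_spectrum(tone)
peak_hz = np.argmax(spectrum) * fs / len(tone)
```

Averaging across tapers lowers the variance of the estimate at the cost of smearing each spectral peak over a band of roughly `2 * nw` FFT bins, so `peak_hz` lands near 1 kHz but not necessarily exactly on it.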