2009
DOI: 10.1155/2009/308340

Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers

Abstract: Dysarthria is a motor speech disorder characterized by weakness, paralysis, or poor coordination of the muscles responsible for speech. Although automatic speech recognition (ASR) systems have been developed for disordered speech, factors such as low intelligibility and limited phonemic repertoire decrease speech recognition accuracy, making conventional speaker adaptation algorithms perform poorly on dysarthric speakers. In this work, rather than adapting the acoustic models, we model the errors made by the s…

Cited by 32 publications (17 citation statements)
References 24 publications
“…Thus, the objective of the statistical modelling of the phoneme confusion-matrix is to estimate W from P*. This can be accomplished by the following expression [22]:…”
Section: Stimulus (citation type: mentioning)
confidence: 99%
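
The expression referred to in the statement above is elided in this excerpt and is not reproduced here. As a rough illustration of what a phoneme confusion matrix holds, the following minimal sketch estimates Pr(decoded phoneme | intended phoneme) by counting aligned pairs and normalizing each row. The function name `estimate_confusions`, the `smoothing` parameter, and the toy alignment data are all hypothetical and are not taken from the cited work.

```python
from collections import Counter, defaultdict


def estimate_confusions(aligned_pairs, smoothing=0.0):
    """Estimate Pr(decoded | intended) from aligned phoneme pairs.

    aligned_pairs: iterable of (intended, decoded) phoneme labels
    (hypothetical input; a real system would obtain these by aligning
    reference transcriptions with recognizer output).
    smoothing: optional additive constant applied to every cell.
    """
    aligned_pairs = list(aligned_pairs)
    counts = defaultdict(Counter)
    for intended, decoded in aligned_pairs:
        counts[intended][decoded] += 1

    # Columns of the matrix: every phoneme the recognizer ever produced.
    decoded_inventory = sorted({decoded for _, decoded in aligned_pairs})

    matrix = {}
    for intended, row in counts.items():
        total = sum(row.values()) + smoothing * len(decoded_inventory)
        matrix[intended] = {
            decoded: (row[decoded] + smoothing) / total
            for decoded in decoded_inventory
        }
    return matrix


# Toy data (invented for illustration): /b/ is decoded as /p/ once in four times.
pairs = [("b", "b"), ("b", "p"), ("b", "b"), ("b", "b"), ("s", "s"), ("s", "t")]
confusions = estimate_confusions(pairs)
print(confusions["b"])
```

Each row of the resulting table sums to 1, so it can be read as a conditional distribution over what the recognizer outputs for a given intended phoneme.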
“…Instead, a Hidden Markov model (HMM) can be constructed for each of the phonemes in the phoneme inventory. These HMMs, termed as metamodels [22,24], can be best understood by comparison with a "standard" acoustic HMM: a standard acoustic HMM estimates Pr(O′ | p_j), where O′ is a subsequence of the complete sequence of observed acoustic vectors in the utterance, O, and p_j is a postulated phoneme in P. A metamodel estimates Pr(P′ | p_j), where P′ is a subsequence of the complete sequence of observed (decoded) phonemes in the utterance P.…”
Section: Metamodels (citation type: mentioning)
confidence: 99%
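
To make the quantity Pr(P′ | p_j) concrete, here is a minimal sketch of the forward algorithm for a small discrete HMM whose observations are decoded phoneme labels. The two-state topology (one substitution state followed by an optional insertion state), the `END` marker, and every probability below are assumptions made up for illustration; they are not the metamodel architecture or parameters of [22,24].

```python
END = "END"  # hypothetical marker for leaving the model


def sequence_likelihood(observed, initial, transition, emission):
    """Return Pr(observed | model) for a discrete HMM via the forward algorithm.

    observed:   non-empty list of decoded phoneme labels (P' in the text)
    initial:    dict state -> start probability
    transition: dict state -> dict of next state (or END) -> probability
    emission:   dict state -> dict of phoneme label -> probability
    """
    if not observed:
        raise ValueError("this sketch does not model empty sequences (deletions)")
    states = list(initial)

    # alpha[s] = Pr(observations so far, current state = s)
    alpha = {s: initial[s] * emission[s].get(observed[0], 0.0) for s in states}
    for symbol in observed[1:]:
        alpha = {
            s: emission[s].get(symbol, 0.0)
            * sum(alpha[r] * transition[r].get(s, 0.0) for r in states)
            for s in states
        }
    # Require the model to exit after the last observed phoneme.
    return sum(alpha[s] * transition[s].get(END, 0.0) for s in states)


# Toy model for the postulated phoneme /b/ (all numbers invented).
initial = {"sub": 1.0, "ins": 0.0}
transition = {"sub": {"ins": 0.2, END: 0.8}, "ins": {END: 1.0}}
emission = {"sub": {"b": 0.7, "p": 0.3}, "ins": {"ah": 0.5, "m": 0.5}}

p_substitution = sequence_likelihood(["p"], initial, transition, emission)
p_with_insertion = sequence_likelihood(["b", "ah"], initial, transition, emission)
print(p_substitution, p_with_insertion)
```

With these toy numbers, decoding /b/ as a lone /p/ has probability of about 0.24, and decoding it as /b/ followed by an inserted /ah/ has probability of about 0.07. Deletions, where the postulated phoneme produces no decoded phoneme at all, are deliberately left out of this sketch.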
“…Also, words with long voiceless stops can be interpreted as two words because of the long silent occlusion phase in the middle of the target word (before → be for) [17, 18].…”
Section: Introduction (citation type: mentioning)
confidence: 99%