2010 IEEE International Workshop on Multimedia Signal Processing 2010
DOI: 10.1109/mmsp.2010.5662075

Multimodal speech recognition of a person with articulation disorders using AAM and MAF

Abstract: We investigated the speech recognition of a person with articulation disorders resulting from athetoid cerebral palsy. The articulation of speech tends to become unstable due to strain on speech-related muscles, and that causes degradation of speech recognition. Therefore, we use multiple acoustic frames (MAF) as an acoustic feature to solve this problem. Further, in a real environment, current speech recognition systems do not have sufficient performance due to noise influence. In addition to acoustic feature…
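
The abstract does not say how the multiple acoustic frames are assembled. As a rough illustration only, one common way to build a multi-frame feature is to concatenate each frame with its neighbors. The sketch below assumes this; the function name `stack_frames`, the `context` parameter, and the edge handling (repeating the first and last frame) are hypothetical choices and are not taken from the paper.

```python
from typing import List, Sequence


def stack_frames(frames: Sequence[Sequence[float]], context: int = 1) -> List[List[float]]:
    """Concatenate each frame with `context` neighbors on each side.

    Illustrative assumption, not the paper's method: edge frames are
    handled by repeating the first or last frame.
    """
    if context < 0:
        raise ValueError("context must be non-negative")

    num_frames = len(frames)
    stacked: List[List[float]] = []
    for t in range(num_frames):
        window: List[float] = []
        for offset in range(-context, context + 1):
            # Clamp the index so that edges reuse the first/last frame.
            idx = min(max(t + offset, 0), num_frames - 1)
            window.extend(frames[idx])
        stacked.append(window)
    return stacked


# Toy input: four frames with two coefficients each.
features = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
stacked = stack_frames(features, context=1)
```

With `context=1`, each output vector is three times as long as an input frame, so the feature carries some short-term temporal context rather than a single snapshot.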

Cited by 20 publications (12 citation statements)
References 10 publications
“…In Ref. [10], we used multiple acoustic frames (MAF) as an acoustic dynamic feature to improve the recognition rate of a person with an articulation disorder, especially in speech recognition using dynamic features only.…”
Section: Related Work (mentioning)
confidence: 99%
“…This application can be launched on Android™ cell phones and tablet computers. Here, model training and recognition methods of the recognizer which accepts connected-digit utterances are described: In the visual modality, several features have been proposed; for example, discrete-cosine-transform results and optical flow-based parameters [3] as pixel-based features, alternatively, Active Appearance Model (AAM) parameters [5] as model based features. These features have been investigated and compared in [13].…”
Section: B. AVASR on Smart Cell Phones (mentioning)
confidence: 99%
“…Face detection, that is realized as a hardware function of the camera, is conducted in each picture to obtain a face region. A monochrome captured image with face detection results are then stored into a visual frame. ii.…”
[Table row interleaved in the excerpt: Name/Issuer: AT R [1], Tokyo Tech [3], Tokyo Tech [16], M2TINIT [9], Tokyo Tech [17], Tokyo Tech [18], Kobe Univ [5], CENSREC-I-AV [15], CENSREC-2-AV [19]]
Section: B. AVASR on Smart Cell Phones (mentioning)
confidence: 99%
“…In [9], we proposed robust feature extraction based on principal component analysis (PCA) with more stable utterance data instead of DCT. In [10], we used multiple acoustic frames (MAF) as an acoustic dynamic feature to improve the recognition rate of a person with an articulation disorder, especially in speech recognition using dynamic features only. In spite of these efforts, the recognition rate for articulation disorders is still lower than that of physically unimpaired persons.…”
Section: Introduction (mentioning)
confidence: 99%