2007 IEEE/RSJ International Conference on Intelligent Robots and Systems 2007
DOI: 10.1109/iros.2007.4399300

Coarse speech recognition by audio-visual integration based on missing feature theory

Abstract: Audio-visual speech recognition (AVSR) is a promising approach to improve noise robustness of speech recognition in the real world. A phoneme and a viseme are used as an auditory and visual unit for AVSR, respectively. However, in the real world, they are often misclassified due to additional input noises. To solve this problem, we propose two approaches. One is audio-visual integration based on missing feature theory to cope with missing or unreliable audio and visual features for recognition. The other is a …

Cited by 9 publications (4 citation statements)
References 20 publications
“…We simply introduced our reported AVSR for robots [15] as mentioned in Section 1, because this AVSR system showed high noise-robustness to improve speech recognition even when either audio or visual information is missing and/or contaminated by noises. This kind of high performance is derived from missing feature theory (MFT) which drastically improves noise-robustness by using only reliable acoustic and visual features by masking unreliable ones out.…”
Section: The Second Layer AV Integration Block (mentioning)
confidence: 99%
“…We simply introduced our reported AVSR for robots [6] as mentioned in Section I, because this AVSR system showed high noise-robustness to improve speech recognition even when either audio or visual information is missing and/or contaminated by noises. This kind of high performance is derived from missing feature theory (MFT) which drastically improves noise-robustness by using only reliable acoustic and visual features by masking unreliable ones out.…”
Section: B. The Second Layer AV Integration Block (mentioning)
confidence: 99%
“…To solve the difficulties, we reported AVSR for robots by introducing two psychologically-inspired methods [6]. One is missing feature theory (MFT) which improves noise-robustness by using only reliable acoustic and visual features by masking unreliable ones out.…”
Section: Introduction (mentioning)
confidence: 99%
“…To tackle with these difficulties, we reported AVSR for a robot based on two psychologically-inspired methods [6]. One is Missing Feature Theory (MFT), which improves noise-robustness by using only reliable audio and visual features by masking unreliable ones out.…”
Section: Introduction (mentioning)
confidence: 99%
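
The citation statements above describe missing feature theory as using only reliable audio and visual features and masking unreliable ones out. The following is a minimal sketch of that general idea, not the paper's implementation: it assumes a hard binary reliability mask, diagonal-Gaussian class models, and a four-dimensional feature vector whose first two entries stand in for audio and last two for visual features. All names (`masked_log_likelihood`, `classify`, the toy models `"a"` and `"b"`) and all numbers are hypothetical.

```python
import math


def masked_log_likelihood(features, means, variances, mask):
    """Diagonal-Gaussian log-likelihood summed over reliable dimensions only.

    Dimensions whose mask entry is False are skipped, which corresponds to
    marginalising them out under a hard (binary) mask. This is an assumed,
    simplified form of missing-feature scoring.
    """
    total = 0.0
    for x, mu, var, reliable in zip(features, means, variances, mask):
        if not reliable:
            continue  # unreliable feature contributes nothing to the score
        total += -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
    return total


def classify(features, mask, models):
    """Return the name of the model with the highest masked log-likelihood."""
    return max(
        models,
        key=lambda name: masked_log_likelihood(
            features, models[name][0], models[name][1], mask
        ),
    )


# Hypothetical toy models: (means, variances) per class.
models = {
    "a": ([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]),
    "b": ([3.0, 3.0, 3.0, 3.0], [1.0, 1.0, 1.0, 1.0]),
}

# An observation of class "a" whose audio dimensions (first two) are corrupted.
observation = [4.0, 5.0, 0.2, 0.1]

all_reliable = [True, True, True, True]
audio_masked = [False, False, True, True]

print(classify(observation, all_reliable, models))  # prints "b"
print(classify(observation, audio_masked, models))  # prints "a"
```

With every feature trusted, the corrupted audio values pull the decision toward the wrong class; masking them leaves the decision to the visual features alone. How the reliability mask is estimated is not covered by this sketch.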