Interspeech 2005
DOI: 10.21437/interspeech.2005-375

A new posterior based audio-visual integration method for robust speech recognition

Cited by 11 publications (3 citation statements)
References 9 publications

“…In this paper, we wish to carry out a complementary study in which we will compare the performance of a variety of different image transform-based feature types for speaker-independent visual speech recognition of isolated digits recorded in various noisy video conditions which may occur in real-world operating conditions. This work extends upon our previous research on the use of geometric-based features for audio-visual speech recognition subject to both audio and video corruptions [22].…”
Section: Introduction (supporting)
confidence: 60%
“…In contrast to many other strategies, such as [10, 33, 34], reliability-based stream integration does not suffer from wide disparities in audio and video model performance. This is greatly beneficial to our case as we wish to design a system that at least avoids any performance degradation due to the inclusion of multiple streams and that ideally profits from the visual modality under all, even under clean, acoustic conditions.…”
Section: Fusion Models and Baselines (mentioning)
confidence: 99%
“…In order to directly compare these values, which may be on different scales, we can normalize by converting to posterior probabilities. One method of selecting the optimal stream probability is the maximum stream posterior (MSP) method [23] that is expressed formally as follows:…”
Section: Maximum Weighted Stream Posterior Model (mentioning)
confidence: 99%
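
The formal MSP expression is cut off in the excerpt above, so the following is only a minimal sketch of the general idea it describes: scores from different streams are put on a common scale by converting them to posteriors, and the stream with the largest posterior is then selected. The function names (`to_posteriors`, `max_stream_posterior`), the softmax normalisation with uniform priors, the per-class selection rule and the example scores are all assumptions made for illustration, not the formulation given in [23].

```python
import math

def to_posteriors(log_likelihoods):
    """Normalise per-class log-likelihoods into posteriors.

    Assumes uniform class priors (an illustrative simplification),
    which reduces Bayes' rule to a softmax over the log-likelihoods.
    """
    m = max(log_likelihoods)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in log_likelihoods]
    total = sum(exps)
    return [e / total for e in exps]

def max_stream_posterior(stream_log_likelihoods):
    """For each class, pick the stream whose posterior is largest.

    `stream_log_likelihoods` maps a stream name to a list of
    per-class log-likelihoods. Returns one (stream, posterior)
    pair per class. This selection rule is an assumed reading of
    "maximum stream posterior", not the definition from [23].
    """
    posteriors = {name: to_posteriors(ll)
                  for name, ll in stream_log_likelihoods.items()}
    n_classes = len(next(iter(posteriors.values())))
    selected = []
    for c in range(n_classes):
        best_stream = max(posteriors, key=lambda name: posteriors[name][c])
        selected.append((best_stream, posteriors[best_stream][c]))
    return selected

# Hypothetical scores: the raw scales differ widely between streams,
# but the normalised posteriors are directly comparable.
scores = {"audio": [-10.0, -12.0], "video": [-3.0, -2.0]}
selected = max_stream_posterior(scores)
best_class = max(range(len(selected)), key=lambda c: selected[c][1])
```

Here the raw audio and video log-likelihoods are far apart in magnitude, which is why comparing them directly would be misleading; after normalisation, each class is credited to whichever stream supports it most strongly.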