2018
DOI: 10.1007/s12193-018-0267-1

Multimodal speech recognition: increasing accuracy using high speed video data

Cited by 25 publications (12 citation statements)
References 33 publications
“…They are created for different purposes and with different means. Works [28], [29] contain a comprehensive list and analysis of such databases from the audio-visual speech recognition point of view.…”
Section: Multimodal Corpora For Audio-visual Speech Recognition In
Citation type: mentioning
confidence: 99%
“…As well as commonly used visual features extraction algorithms and visual speech modeling methods. In addition, we started with investigation of region-of-interest (ROI) detection approaches (Ivanko et al, 2018a). We found out that Active Appearance Models-based and Haar-like features-based methods most widely used for this purpose.…”
Section: Backgrounds and Related Research
Citation type: mentioning
confidence: 99%
“…There are several state-of-the-art methods for model training. Initially, the most widespread methods were based on the use of hidden Markov models (HMM) for visual speech recognition and their coupled or multistream versions for audio-visual speech recognition [23]. However, at present, the approaches based on the use of neural networks of different architectures have become increasingly popular [24].…”
Section: Backgrounds
Citation type: mentioning
confidence: 99%