2018
DOI: 10.1016/j.dsp.2018.06.004

Audio-visual feature fusion via deep neural networks for automatic speech recognition

Cited by 28 publications (8 citation statements)
References 14 publications
“…In Reference [21], deep autoencoders are proposed to produce efficient bimodal features from the audio and visual stream inputs. The authors obtained an average relative reduction of 36.9% for a range of different noisy conditions, and also, a relative reduction of 19.2% for the clean condition in terms of the Phoneme Error Rates (PER) in comparison with the baseline method.…”
Section: Deep Learning Techniques (citation type: mentioning)
confidence: 99%
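
As a rough illustration of the bimodal-autoencoder idea described in the statement above, the sketch below fuses an audio and a visual feature vector through a shared encoding layer and reconstructs both streams; the shared code is the fused feature handed to the recognizer. All layer sizes, feature dimensions, and names are assumptions for illustration only and are not the architecture of Reference [21].

import torch
import torch.nn as nn

class BimodalAutoencoder(nn.Module):
    def __init__(self, audio_dim=39, visual_dim=50, shared_dim=64):
        super().__init__()
        # Modality-specific encoders map each stream to a hidden representation.
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, 128), nn.ReLU())
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, 128), nn.ReLU())
        # A shared layer fuses both streams into a single bimodal code.
        self.shared = nn.Linear(128 + 128, shared_dim)
        # Decoders reconstruct each modality from the shared code.
        self.audio_dec = nn.Linear(shared_dim, audio_dim)
        self.visual_dec = nn.Linear(shared_dim, visual_dim)

    def forward(self, audio, visual):
        h = torch.cat([self.audio_enc(audio), self.visual_enc(visual)], dim=-1)
        code = torch.relu(self.shared(h))  # bimodal feature used by the recognizer
        return code, self.audio_dec(code), self.visual_dec(code)

# Training minimizes the reconstruction error of both streams; the shared code
# then serves as the fused input feature for the speech recognizer.
model = BimodalAutoencoder()
audio = torch.randn(8, 39)   # e.g. per-frame acoustic features (dimension assumed)
visual = torch.randn(8, 50)  # e.g. per-frame lip-region features (dimension assumed)
code, audio_rec, visual_rec = model(audio, visual)
loss = nn.functional.mse_loss(audio_rec, audio) + nn.functional.mse_loss(visual_rec, visual)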
“…Feature fusion methods achieve good performance in processing speech data. Hasan et al. [27] proposed audio-visual feature fusion via deep neural networks and achieved speech recognition with a low error rate. In addition, audio-visual feature fusion has been used to recognize lip language [28].…”
Section: Summarized (citation type: mentioning)
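
The statement above refers to feature-level fusion of the audio and visual streams. A minimal sketch of the simplest variant, early fusion by concatenating the two feature vectors and feeding a DNN classifier, is given below; the dimensions, layer sizes, and names are illustrative assumptions and do not reproduce the model of Hasan et al. [27] or the lip-reading system of [28].

import torch
import torch.nn as nn

class EarlyFusionClassifier(nn.Module):
    def __init__(self, audio_dim=39, visual_dim=50, n_classes=40):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(audio_dim + visual_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, n_classes),  # per-frame phoneme (or viseme) logits
        )

    def forward(self, audio, visual):
        fused = torch.cat([audio, visual], dim=-1)  # fuse by simple concatenation
        return self.net(fused)

model = EarlyFusionClassifier()
logits = model(torch.randn(8, 39), torch.randn(8, 50))
print(logits.shape)  # torch.Size([8, 40])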
“…In [12]-[14] it was shown that there is a very close relationship between the theory of error-correcting coding and the theory of neural networks, which are also often used for digital signal processing [15], [16]. This creates quite definite prospects for the practical use of multivalued logics, which are being actively developed at the present time [17], [18].…”
Section: Introduction (citation type: mentioning)
confidence: 99%