Improvement of Acoustic Models Fused with Lip Visual Information for Low-Resource Speech

Yu, Chongchong; Yu, Jiaqi; Qian, Zhaopeng; Tan, Yuchen

doi:10.3390/s23042071

Cited by 3 publications

References 42 publications

Multiview Learning‐Based Speech Recognition for Low‐Resource Languages

Kumar, Yadav. 2024. Automatic Speech Recognition and Translation for Low Resource Languages

An overview of high-resource automatic speech recognition methods and their empirical evaluation in low-resource environments

Fatehi, Torres Torres, Kucukyilmaz. 2025. Speech Communication

Audiovisual Speech Recognition Method Based on Connectionism

Che, Zhu, Adetunji et al. 2024. iam

Audio-visual speech recognition technology has greatly improved the performance of pure speech recognition by combining visual speech information and acoustic speech information, but there are problems such as large data demand, audio and video data alignment, and noise robustness. Scholars have proposed many solutions to these problems. Among them, deep learning algorithms, as representatives of connectionist artificial intelligence technology, have good generalization ability and portability, and are easier to migrate to different tasks and fields. They are becoming one of the mainstream technologies for audio-visual speech recognition. This paper mainly studies and analyzes the application of deep learning technology in the field of audio-visual speech recognition, especially the audio-visual speech recognition model of the end-to-end framework. Through experimental comparative analysis, relevant data sets and evaluation methods are summarized, and finally hot issues that need to be further studied and solved are proposed.
