2019
DOI: 10.30534/ijatcse/2019/27852019
An Assessment of the Visual Features Extractions for the Audio-Visual Speech Recognition

Abstract: Use of visual information from the speaker's mouth region has been shown to improve the performance of Automatic Speech Recognition (ASR) systems. This is particularly valuable in the presence of noise, which even in moderate form severely degrades the recognition performance of systems that use audio information alone. Various sets of features extracted from the speaker's mouth area have been used to improve the performance of an ASR system. In such challenging situations and hav…

Cited by 1 publication (1 citation statement)
References 8 publications
“…Various studies have revealed that the information contained in the speech signal is strongly related to that found in lip movements, and that incorporating information about the latter can improve the recognition performance of both humans and machines. In noisy environments, humans can reduce speech recognition errors by using the speaker's lip movements [1], and indeed many people with hearing difficulties rely on lip reading for most of the speech information they receive. There are two fundamental issues that must be addressed in designing and implementing a lip-reading system: the first is the choice of visual features, while the second is the development of an effective technique for extracting those features from the video stream.…”
Section: Introduction (mentioning)
confidence: 99%
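The two issues raised in the citing statement (which visual features to use, and how to extract them from video) are often illustrated with appearance-based features computed on a mouth region of interest. The sketch below is not the method of the cited paper; it is a minimal illustration that assumes OpenCV Haar-cascade face detection, treats the lower half of the detected face box as a crude mouth ROI, and uses low-order 2D-DCT coefficients of the resized ROI as the per-frame visual feature vector.

```python
# Illustrative sketch only: appearance-based visual features for lip reading.
# Assumptions (not from the cited paper): Haar-cascade face detection, lower
# half of the face box as the mouth ROI, 2D-DCT coefficients as features.
import cv2
import numpy as np
from scipy.fftpack import dct

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def mouth_roi(gray_frame):
    """Return a crude mouth region: lower half of the largest detected face."""
    faces = face_detector.detectMultiScale(gray_frame, 1.1, 5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
    return gray_frame[y + h // 2 : y + h, x : x + w]

def dct_features(roi, size=(32, 32), n_coeffs=36):
    """Resize the ROI and keep the top-left block of 2D-DCT coefficients."""
    roi = cv2.resize(roi, size).astype(np.float32)
    coeffs = dct(dct(roi, axis=0, norm="ortho"), axis=1, norm="ortho")
    k = int(np.sqrt(n_coeffs))
    return coeffs[:k, :k].flatten()

def video_features(path):
    """Extract one visual feature vector per frame of a video file."""
    cap = cv2.VideoCapture(path)
    feats = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        roi = mouth_roi(gray)
        if roi is not None and roi.size > 0:
            feats.append(dct_features(roi))
    cap.release()
    return np.array(feats)
```

The ROI size and number of retained DCT coefficients here are placeholder values; in practice such parameters are tuned, and the per-frame vectors are typically normalized and augmented with temporal derivatives before being passed to the audio-visual recognizer.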