2020
DOI: 10.3390/rs12040653
Non-Contact Speech Recovery Technology Using a 24 GHz Portable Auditory Radar and Webcam

Abstract: Language has been one of the most effective ways of human communication and information exchange. To solve the problem of non-contact robust speech recognition, recovery, and surveillance, this paper presents a speech recovery technology based on a 24 GHz portable auditory radar and webcam. The continuous-wave auditory radar is utilized to extract the vocal vibration signal, and the webcam is used to obtain the fitted formant frequency. The traditional formant speech synthesizer is selected to synthesize and r…
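The pipeline the abstract describes — radar-measured vocal vibration supplying the excitation, webcam-fitted formants shaping the spectrum — rests on classical formant synthesis. As a minimal sketch of that technique (a cascade of second-order resonators driven by a pulse train; the pitch, formant, and bandwidth values below are illustrative assumptions, not taken from the paper):

```python
import numpy as np
from scipy.signal import lfilter

def resonator(freq_hz, bandwidth_hz, fs):
    """Second-order digital resonator (Klatt-style) for one formant."""
    r = np.exp(-np.pi * bandwidth_hz / fs)
    theta = 2.0 * np.pi * freq_hz / fs
    a = np.array([1.0, -2.0 * r * np.cos(theta), r * r])   # pole pair at the formant
    b = np.array([1.0 - 2.0 * r * np.cos(theta) + r * r])  # unity gain at DC
    return b, a

def synthesize_vowel(f0_hz, formants_hz, bandwidths_hz, duration_s=0.5, fs=16000):
    """Excite a cascade of formant resonators with an impulse train at pitch f0."""
    n = int(duration_s * fs)
    source = np.zeros(n)
    source[::int(fs / f0_hz)] = 1.0  # crude glottal pulse train
    out = source
    for f, bw in zip(formants_hz, bandwidths_hz):
        b, a = resonator(f, bw, fs)
        out = lfilter(b, a, out)
    return out / np.max(np.abs(out))  # normalize to [-1, 1]

# Hypothetical formant/bandwidth values for an /a/-like vowel
wave = synthesize_vowel(f0_hz=120, formants_hz=[730, 1090, 2440],
                        bandwidths_hz=[90, 110, 170])
```

In the paper's setting, `f0_hz` would come from the radar's vocal-vibration signal and `formants_hz` from the webcam-based formant fitting; here both are hard-coded stand-ins.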

Cited by 6 publications (2 citation statements)
References: 30 publications
“…At present, the present technologies for obtaining speech signals can be divided into air conduction and non-air conduction detection. However, the shortcomings limit the development of these techniques for speech detection [1][2][3][4][5]. In recent years, bio-radar technology has been developed for using in a variety of remote sensing applications [6,7].…”
Section: Introduction
confidence: 99%
“…Based on recent advances in machine learning-based technologies, the conversion of biosignals to speech signals has been reported in several studies [3], [4], [5]. Various signals have been considered for speech generation and enhancement, including surface electromyography (sEMG) [3], [6], electromagnetic articulography (EMA) [4], [7], permanent magnetic articulography (PMA) [5], [8], ultrasound tongue imaging [9], [10], Doppler signals [11], [12], visual cues [13], [14], and bone-conducted microphone signals [15]. Further, multimodal learning has been leveraged to integrate information from complementary data, such as text [16], videos [13], bone-conducted microphone signals [15], and articulatory movements [4]. However, the transformation of articulatory movements to facilitate communication has not yet been adequately researched.…”
confidence: 99%